Abstract

The Lenski experiment investigates the long-term evolution of bacterial populations. In this paper we 
present an individual-based probabilistic model that captures essential features of the experimental 
design, and whose mechanism does not include epistasis in the continuous-time (intraday) part of the 
model, but leads to an epistatic effect in the discrete-time (interday) part. We prove that under some 
assumptions excluding clonal interference, the rescaled relative fitness process converges in the large 
population limit to a power law function, similar to the one obtained by Wiser et al. (2013), there 
attributed to effects of clonal interference and epistasis. 

1. Introduction

The Lenski experiment (see pi mm3 for a detailed description) is a cornerstone in experimental 
evolution. It investigates the long-term evolution of 12 initially identical populations of the bacterium 
E. coli in identical environments. One of the basic concepts of the Lenski experiment is that of the 
daily cycles. Every day starts by sampling approximately $5 \cdot 10^6$ cells from the bacteria available in
the medium that was used the day before. This sample is propagated in a minimal glucose medium.
The bacteria then reproduce (by binary splitting) with an exponential population growth. The
reproduction continues until the medium is depleted, i.e., until no more glucose is available. Then
the reproduction stops and a phase of starvation starts. This phase lasts until the beginning of the
next day, when the new sample is transferred to fresh medium. Around $5 \cdot 10^8$ cells are present at the
end of each day.

Up to now the experiment has been going on for more than 60’000 generations (or 9000 days, see 
CZI). One important feature is that samples of ancestral populations were stored, which afterwards 
could be made to reproduce under competition with later generations in order to experimentally
determine the fitness of an evolved strain relative to the founder ancestor of the population by comparing
their growth rates in the following manner [18]: A population of size $A_0$ of the unevolved strain and
a population of size $B_0$ of the evolved strain perform a direct competition in the minimal glucose
medium. The respective population sizes at the end of the day, that is, after the glucose is consumed,
are denoted by $A_1$ and $B_1$. The (empirical) relative fitness $F(B|A)$ of strain $B$ with respect to strain
$A$ is then given by the ratio of the exponential growth rates, calculated as

$$F(B|A) = \frac{\log(B_1/B_0)}{\log(A_1/A_0)}. \qquad (1)$$

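As a small numerical illustration of this definition, the following Python sketch computes the empirical relative fitness from the four population counts, assuming the ratio-of-log-growth form $\log(B_1/B_0)/\log(A_1/A_0)$. The function name and the sample counts are hypothetical and serve only as an illustration.

```python
import math

def relative_fitness(a0, a1, b0, b1):
    """Ratio of the exponential growth rates of strain B and strain A over one day."""
    if min(a0, a1, b0, b1) <= 0:
        raise ValueError("population sizes must be positive")
    return math.log(b1 / b0) / math.log(a1 / a0)

# Hypothetical counts: the ancestor A grows 80-fold, the evolved strain B grows 120-fold.
print(relative_fitness(a0=2.5e6, a1=2.0e8, b0=2.5e6, b1=3.0e8))
```

A value above 1 means that the evolved strain grew faster than the ancestor during the competition.
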
Considerable changes of the relative fitness have been observed in the more than 25 years of the
experiment (PI SOU). As expected, the relative fitness of the population increases over time, but 
one of the features that have been observed is a pronounced deceleration in the increase of the relative 
fitness, see Figure 2 in [25]. In particular it has been observed that it increases sublinearly over 
time. Several questions have arisen in this context (II ESI): How can the change of relative fitness be 
explained or approximated? Which factors account for the deceleration in the increase of the relative 
fitness? 

In g], the authors perform an analysis on the change of the relative fitness for the first 20’000 
generations of the experiment, and of the mutations that go to fixation during the same period. 
They conjecture that effects of dependence between mutations, like clonal interference and epistasis, 
contribute crucially to the deceleration of the gain of relative fitness. 

In [25], the authors analyse the change of the relative fitness for the first 50’000 generations of 
the experiment, and fit the observations to a power law function. They also conjecture that clonal 
interference and epistasis contribute crucially to the quantitative behavior of relative fitness, and 
support this conjecture by sketching a mathematical model which predicts a power law function for 
the relative fitness. 

In this paper, we propose a basic mathematical model for a population that captures essential 
features of the Lenski experiment, in particular the daily cycles. It models an asexually reproducing 
population whose growth in each cycle is stopped after a certain time, and a new cycle is started with a 
sample of the original population. We include (beneficial) mutations into the model by assuming that 
an individual may mutate with a certain (small) probability and draws a certain (small) reproductory 
benefit from the mutation, which results in an increase of the reproduction rate during the cycle. We 
then calculate the probability of fixation of a beneficial mutation, and its time to fixation. Using 
this, we can prove that under some conditions on the parameters of mutation and selection, with 
high probability there will be no clonal interference in the population, which means in our situation 
that, with high probability, beneficial mutations only arrive when the population is homogeneous (in 
the sense that all its individuals have the same reproduction rate). This result implies that we are 
essentially dealing with a model of adaptive evolution, which allows a thorough mathematical analysis. 
In particular, using convergence results for Markov chains in the spirit of BE we are able to prove 
that the relative fitness of the population, on a suitable timescale in terms of the population size, 
converges locally uniformly to a deterministic curve (see Figure 2).

In this way we arrive at an explanation of a power law behavior (with a deceleration in the 
increase) of the relative fitness. This explanation is in terms of the experiment’s design (which makes
the generation time dependent on the fitness level), and does not invoke clonal interference, nor a
direct epistatic effect of the beneficial mutations (see Sec. 1.5). 

More specifically, in our model every beneficial mutation which is successful in the sense that it goes
to fixation increases the individual reproduction rate by the same amount ($\varrho$, say), irrespective
of the current value $r$ of the individual reproduction rate. In this sense the model is “non-epistatic”.
However, there will be an indirect epistatic effect caused by the design of the experiment: since the
amount of glucose, which the bacteria get for their population growth, remains the same from day
to day, a population with a high individual reproduction rate will consume this amount more quickly
than a population with a low individual reproduction rate. In other words, the daily duration of
the experiment (that is, the time $t = t_i$ during which the population grows at day $i$) will depend on
the current level $r = r_i$ of the individual reproduction rate, and will become shorter as $r$ increases.
Indeed, the ratio of the two expected growth factors in one day is $\exp((r + \varrho)t)/\exp(rt) = \exp(\varrho t)$.
Even though $\varrho$ does not depend on $r$ by our assumption, this ratio does depend on $r$, because, as
stated above, the duration $t = t_i$ of the daily cycles becomes smaller as $r$ increases. We are well aware
that clonal interference as well as direct epistatic effects will also be at work in the Lenski experiment,
and should be modelled. On the other hand, our results might help to separate these effects from an
indirect epistatic effect caused by a shortening of the daily cycles as the generations proceed, which
would go along with a quicker consumption of the daily nutrition as fitness increases.
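The indirect effect can be made concrete with a short Python sketch. It assumes that a day lasts $t = \log(\gamma)/r$ time units, i.e. the time an exponentially growing population needs to multiply by the daily growth factor $\gamma$, and it uses $\gamma = 100$ as suggested by the experiment. The function names and the numerical values of $r$ and $\varrho$ are hypothetical.

```python
import math

def day_length(r, gamma=100.0):
    """Time needed for a population growing at rate r to multiply by gamma."""
    return math.log(gamma) / r

def growth_ratio(r, rho, gamma=100.0):
    """Ratio exp((r + rho) * t) / exp(r * t) of the two growth factors over one day."""
    return math.exp(rho * day_length(r, gamma))

# The same increment rho yields a smaller daily advantage when r is already large.
for r in (1.0, 1.5, 2.0):
    print(r, day_length(r), growth_ratio(r, rho=0.01))
```
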

In the remainder of this introductory section we discuss our mathematical approach and main 
results, and put our methods into the context of related work. The formal statement of the model 
and the main results will be given in Section 2, and the proofs in Section 3. The most intricate proof
is that of Theorem 2.9 which relies on a coupling of the daily sampling scheme with near-critical 
Galton-Watson processes that is successful over a sufficiently long time period. Some tools from the 
theory of branching processes (Yule and Galton-Watson processes) are presented in the Appendix. 


1.1. A neutral model for the daily cycles 

We build our model on few basic assumptions: Every individual reproduces independently by 
binary splitting at a given rate until the end of a growth cycle, which corresponds to one day (in 
the sense of El)- Our daily cycle model is determined by specifying the reproduction rate of each 
individual, and a stopping rule to end the growth of the population. To illustrate this we assume for 
the moment a neutral situation , i.e. all individuals have the same reproduction rate. The experiment 
is laid out such that the total number of bacteria at the end of one day is roughly the same for every 
day. This suggests the following mathematical assumptions: Each day starts with a population of $N$
individuals. These individuals reproduce by binary splitting at some fixed rate $r$ until the maximum
capacity is reached. We assume that this happens (and that the “Lenski day” is over) as soon as the
total number of cells in the medium is close to $\gamma N$ for some constant $\gamma > 1$ (a precise definition and
a discussion of the corresponding stopping rules will be given in Section 2.1). The description of the
experiment suggests to think of $N = 5 \cdot 10^6$ and $\gamma \approx 100$, since at the end of each day, one gets around
$5 \cdot 10^8$ bacteria, see supplementary material of [18]. The subsequent day is started by sampling $N$
individuals from the approximately $\gamma N$ total amount available, and the procedure is repeated.
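A quick arithmetic check, sketched in Python, relates these numbers to the generation count quoted above. It assumes that one generation corresponds to one doubling of the population, so that a day with growth factor $\gamma$ contains $\log_2 \gamma$ generations. The variable names are illustrative.

```python
import math

N = 5 * 10**6        # cells at the start of a day
GAMMA = 100          # daily growth factor, 5e8 / 5e6
DAYS = 9000          # approximate number of days of the experiment so far

doublings_per_day = math.log2(GAMMA)
print(doublings_per_day)          # about 6.64 doublings per day
print(DAYS * doublings_per_day)   # total number of doublings over all days
```

Under this assumption, 9000 days correspond to roughly 60’000 generations, in line with the figures reported for the experiment.
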

This setting induces a genealogical process, which we study on the evolutionary time scale, that is 
with one unit of time corresponding to $N = 5 \cdot 10^6$ days. On this time scale, the genealogical process
turns out to be approximately a constant time change of the Kingman coalescent, where the constant
is $c_\gamma := 2(1 - \frac{1}{\gamma})$. In this sense, $N/c_\gamma$ plays the role of an effective population size. With the stated
numbers, this is much larger than the number ($\approx 9000$) of “Lenski days” that have passed so far. In
other words, in the neutral model so far only a small fraction of one unit of the evolutionary timescale 
has passed. Still, this model provides a good basis to introduce mutation and selection. In fact, we 
will see that the design of the experiment (via the stopping rule that defines the end of each day) 
affects the selective advantage provided by a beneficial mutation and in this way has an influence that 
goes well beyond the determination of the effective population size in the neutral model. 
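To get a feeling for the orders of magnitude, the following Python sketch evaluates the effective population size for the numbers given above, assuming the constant has the form $c_\gamma = 2(1 - 1/\gamma)$. The function name is hypothetical.

```python
def c_gamma(gamma):
    """Time-change constant 2 * (1 - 1/gamma) of the neutral genealogy."""
    return 2.0 * (1.0 - 1.0 / gamma)

N = 5 * 10**6
GAMMA = 100
effective_size = N / c_gamma(GAMMA)

print(effective_size)           # about 2.5 million days per unit of evolutionary time
print(9000 / effective_size)    # fraction of one time unit that has elapsed so far
```
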

Our genealogical model arises naturally from the daily cycle setting, see Figure 1. Schweinsberg
m obtained a Cannings dynamics by sampling generation-wise $N$ individuals from a supercritical
Galton-Watson forest, and analysed the arising coalescents as $N \to \infty$. Our model is similar in spirit,
with the binary splitting leading to Yule processes. We will introduce the additional feature that 
some individuals reproduce at a faster rate; in this sense Schweinsberg’s sampling approach to neutral 
coalescents is naturally extended to a case with selection. 


1.2. Mutants versus standing population 

Next we consider a modification of the previous model, supposing that at a certain day a fraction
of the population reproduces at rate $r$, while the complementary fraction (founded by some beneficial
mutant in the past) reproduces faster, say at rate $r + \varrho_N$, with $\varrho_N > 0$. Our assumptions will be that
the increment $\varrho_N$ of the reproduction rate is small, but not too small; more precisely, we will assume
that $\varrho_N \sim N^{-b}$ for some $0 < b < 1/2$ ($\sim$ denoting asymptotic equivalence, i.e. the convergence of the
ratio to 1 as $N \to \infty$). We assume that the reproduction rate is heritable. Based on the observation
that with the stopping procedure indicated above a “Lenski day” lasts approximately $\frac{\log\gamma}{r}$ units of
time of the Yule process, we will prove in Proposition 2.8 that the expected number of offspring at the
beginning of the next day of an individual with reproduction rate $r + \varrho_N$ is increased for large $N$ by
approximately $\varrho_N \frac{\log\gamma}{r}$ compared to an individual with reproduction rate $r$. In this sense the effective
selective advantage of a beneficial mutation is approximately $\varrho_N \frac{\log\gamma}{r}$.

Figure 1: Two of the daily cycles (or “days”), with $N = 4$ and $\gamma = 3$. The $N$-sample at the end of day 1
constitutes the parental population at the beginning of day 2.

Let us emphasize that here one obtains a dependence on the reproduction rate $r$ of the standing
population due to the relation between $r$ and the “length of a day”, i.e. the time span it takes the
total population to reach the maximum capacity. The implication of this result is that the selective
advantage provided by reproducing $\varrho_N$ units faster is comparatively large if the standing population
is not well adapted and thus reproduces at a low rate, and is comparatively small if the population is
well adapted in the sense that it already reproduces fast.


1.3. Genetic and adaptive evolution 

In order to study the genetic and adaptive evolution of a population under the conditions of the 
Lenski experiment, we consider a model with moderately strong selection - weak mutation and constant 
additive fitness effect of the mutations. We assume that the population reproduces in daily cycles as 
described above, and that at each day with probability $\mu_N$ a beneficial mutation occurs within the
ancestral population of that day, where $\mu_N \to 0$ as $N \to \infty$. Following the ansatz described above,
we assume that an individual affected by such a beneficial mutation increases its reproduction rate
and that of its offspring by $\varrho_N$. Some of these mutations will go to fixation (in which case they will
be called “successful”), while the others are lost from the population. Calculating the probability of 
fixation of a beneficial mutation is a classical problem, studied already at the beginning of the last 
century by Haldane in the Wright-Fisher model. These questions still have a major interest in modern 
times, and have recently been studied in different contexts (see for example [TB] or [M|). 

Assume now that the initial population on day $i$ consists of $N - 1$ individuals that reproduce at
rate $r$ and one mutant that reproduces at rate $r + \varrho_N$. We will see in Theorem 2.9 that the probability
of fixation of such a mutant is asymptotically

$$\frac{\varrho_N \log\gamma}{r}\,\frac{\gamma}{\gamma - 1} \qquad (2)$$

as $N \to \infty$. A crucial role in the proof of our result is played by an intricate approximation of the
number of the mutants’ descendants by near-critical Galton-Watson processes, as long as their number
is relatively small compared to the total population.

In Proposition 2.12 we prove that in a certain regime of the model parameters, namely if
$\varrho_N \sim N^{-b}$, $\mu_N \sim N^{-a}$, with $b \in (0, 1/2)$ and $a > 3b$, the time it takes for a mutation to go to fixation or
extinction is with high probability shorter than the time between two mutation events, which is of
order $\mu_N^{-1}$. This result allows us to exclude clonal interference on the time scale $\lfloor t\varrho_N^{-2}\mu_N^{-1}\rfloor$, and to
approximate the reproduction rate process of our original model by a simple Markov chain which can
be interpreted as an idealized process where successful mutations fixate immediately on the scale of
their arrival rate, and unsuccessful ones are neglected.

In this respect, the analysis presented in this paper can be seen in the framework of the theory 
of stochastic adaptive dynamics, as studied by Champagnat, Meleard and others, see 0 0 and 
references therein. Let us emphasize, however, that we prove the validity of our approximation by 
taking simultaneous limits of the population size $N \to \infty$, the rate of mutation $\mu_N \to 0$, and the
increment of the reproduction rate $\varrho_N \to 0$, which requires some care, and is carried out by taking
the specifics of our model into account. 


1.4. Deterministic approximation on longer time scales

The calculation of the fixation probability in Theorem 2.9 and the exclusion of clonal interference
in Proposition 2.12, as well as the resulting Markov chain approximation of the reproduction rate
process, are the key steps in the analysis of the long-term behaviour of the population in the Lenski
experiment. This allows us to derive the process counting the number of eventually successful beneficial
mutations until a certain day, and the process of the relative fitness of the evolved population compared 
to the initial fitness. 

It turns out, as we prove in Theorem 2.13, that for large $N$, on the time scale $\lfloor t(\varrho_N\mu_N)^{-1}\rfloor$, the
number of successful mutations is approximately a Poisson process with constant rate $\frac{\gamma\log\gamma}{(\gamma - 1)r_0}$, if the
observation of the population starts on some day where the reproduction rate is constant and equal
to $r_0 > 0$.

In order to define the fitness of an evolved strain relative to the unevolved one, we assume that
the unevolved population, taken from the first day of the experiment, is homogeneous and evolves at
rate $r_0$. In view of (1) we define the fitness of the population at the beginning of day $i$ with respect
to that at the beginning of day 0 as

$$F_i^{(N)} := \frac{\log\Big(\frac{1}{N}\sum_{j=1}^{N} e^{R_{i,j}u}\Big)}{\log e^{r_0 u}}, \qquad (3)$$

where $R_{i,j}$, $j = 1, \dots, N$, are the reproduction rates of the individuals present at the beginning of day
$i$, and $u$ is a given time for which the two populations are allowed to grow together. (This time may
also depend on $i$, which does not affect our results.) For brevity we call $F_i^{(N)}$ the relative fitness at day $i$.

We prove in Theorem 2.14 that, under the assumptions described above and specified in Sec. 2,
the sequence of time-rescaled processes $(F^{(N)}_{\lfloor t\varrho_N^{-2}\mu_N^{-1}\rfloor})_{t \ge 0}$ converges locally uniformly as $N \to \infty$ to the
parabola

$$f(t) = \sqrt{1 + \frac{2\gamma\log(\gamma)}{(\gamma - 1)r_0^2}\,t}, \qquad t \ge 0. \qquad (4)$$

Hence our model, which should be regarded as idealized and basic, still succeeds in describing the
observed sublinear increase of relative fitness quite well on a qualitative level, even without incorporating
the effects of clonal interference or epistasis.
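The square-root shape of the limit can be checked numerically. The Python sketch below rests on a heuristic that is an assumption of this illustration, not the proof of Theorem 2.14: on the rescaled time axis the reproduction rate $r$ is taken to follow $dr/dt = C/r$ with $C = \gamma\log(\gamma)/(\gamma - 1)$ (a jump of size $\varrho_N$ times a success rate proportional to $\varrho_N/r$), whose solution is $r(t)/r_0 = \sqrt{1 + 2Ct/r_0^2}$. Function names and parameter values are hypothetical.

```python
import math

def limit_curve(t, r0=1.0, gamma=100.0):
    """Square-root curve f(t) = sqrt(1 + 2*C*t / r0**2), C = gamma*log(gamma)/(gamma-1)."""
    c = gamma * math.log(gamma) / (gamma - 1.0)
    return math.sqrt(1.0 + 2.0 * c * t / r0**2)

def euler_fitness(t_end, r0=1.0, gamma=100.0, steps=100_000):
    """Integrate dr/dt = C / r by explicit Euler steps and return r(t_end) / r0."""
    c = gamma * math.log(gamma) / (gamma - 1.0)
    dt = t_end / steps
    r = r0
    for _ in range(steps):
        r += dt * c / r
    return r / r0

# The two columns should agree up to the discretisation error of the Euler scheme.
for t in (0.5, 1.0, 2.0):
    print(t, euler_fitness(t), limit_curve(t))
```
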


1.5. Diminishing returns and epistasis. 

In this subsection we summarize the heuristics which leads to the formula (2) for the fixation
probability in our individual-based model, and compare it with the ansatz of Wiser et al. [29].

Figure 2: The limiting relative fitness curve for $N \to \infty$, if time is rescaled by $\lfloor t\varrho_N^{-2}\mu_N^{-1}\rfloor$, $t \ge 0$. The curve
$f(t)$ is given by (4).



Our basic assumption is that every beneficial mutation adds a fixed amount $\varrho_N$ to the reproduction
rate $r$ of the individual that undergoes the mutation. When all (or nearly all) individuals that are
present at day $i$ have reproduction rate $r$, then this day ends (approximately) at time $\sigma := \frac{\log\gamma}{r}$,
because then $e^{r\sigma} = \gamma$. Consequently, over this day the growth factor of a mutant population whose
reproduction rate is $r + \varrho_N$ is $e^{(r + \varrho_N)\sigma}$, and the ratio of these two growth factors is $e^{\varrho_N\sigma} \approx 1 + \varrho_N\frac{\log\gamma}{r}$,
revealing that the selective advantage of the mutant is $s_N := \varrho_N\frac{\log\gamma}{r}$. In the branching process
approximation for the onset of the mutant, $1 + s_N$ is the offspring mean, while the quantity $c_\gamma = 2(1 - \frac{1}{\gamma})$
that appeared already in Sec. 1.1 converges for $N \to \infty$ to the offspring variance, see the discussion
after Theorem 2.5. In view of Lemma B.1 in Appendix B, this explains the form (2) of the fixation
probability.
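The heuristic can be turned into a few lines of Python. The sketch assumes the classical approximation that a slightly supercritical branching process with offspring mean $1 + s$ and offspring variance $v$ survives with probability close to $2s/v$; whether this is exactly the statement of the lemma cited above is not verified here. The function names and numerical values are hypothetical.

```python
import math

def selective_advantage(rho, r, gamma):
    """s = rho * log(gamma) / r, the excess of the mutant offspring mean over 1."""
    return rho * math.log(gamma) / r

def offspring_variance(gamma):
    """Limiting offspring variance c_gamma = 2 * (1 - 1/gamma)."""
    return 2.0 * (1.0 - 1.0 / gamma)

def fixation_probability(rho, r, gamma):
    """Near-critical branching heuristic: survival probability of about 2 * s / variance."""
    return 2.0 * selective_advantage(rho, r, gamma) / offspring_variance(gamma)

rho, r, gamma = 0.01, 1.0, 100.0
print(fixation_probability(rho, r, gamma))
print(rho * math.log(gamma) / r * gamma / (gamma - 1.0))   # same value, written as a product
```
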

A related observation appears in [7]: if two populations grow (as a pure birth process) with
Malthusian parameters $r_w$ and $r_m$, and if one generation corresponds to a doubling of the population
size, then a “correct measure for the dynamics of selection per generation” is $(r_m - r_w)T$, where
$T = (\log 2)/r_w$ is the generation time (see [7], formula (3.2)). Our model reflects such a generation
scheme, with $\log\gamma$ instead of $\log 2$, due to the design of the Lenski experiment.

It is interesting to note that our model leads to quite similar conclusions as the one proposed in
[25], although the basic hypotheses are somewhat different. Motivated by m the authors of [25]
assume that the $(n+1)$-st successful mutation increases the individual reproduction rate by a factor
$1 + \tilde S_{n+1}$, where $S_{n+1}$ is distributed exponentially with some parameter $\alpha_n$, and the distribution
of $\tilde S_{n+1}$ is that of $S_{n+1}$ conditioned on the event that the mutation goes to fixation (surviving also
clonal interference). They make the following assumption in order to model diminishing returns: The
sequence $\alpha_n$, $n \in \mathbb{N}_0$, satisfies

$$\alpha_{n+1} = \alpha_n(1 + g\langle\tilde S_{n+1}\rangle), \qquad (5)$$

where $g$ is a positive constant and $\langle\tilde S_{n+1}\rangle$ is the expected value of $\tilde S_{n+1}$. According to [25], the
parameter $g$ serves to model the phenomenon of epistasis, which corresponds to a non-linearity in
the fitness effects. Through (5), it is a priori assumed that the expected value of the beneficial effect
of a mutation decreases as the number of successful mutations increases. Arguing heuristically by a
branching process approximation, the authors of [25] obtain an approximation of the relative fitness
by the function

$$w = (ct + 1)^{1/(2g)}. \qquad (6)$$

Here $c$ depends on clonal interference and epistasis. In [25] the approximation is compared to real
data, taking different pairs $(g, c)$ and showing that the power law approximation in equation (6) fits
better to data than the hyperbolic curve proposed in [?].

Our Theorem 2.14 is consistent with (6), as we prove that, under the assumptions of our model,

$$w = (c't + 1)^{1/2}. \qquad (7)$$

Notably, the “diminishing returns” for the case $g = 1$ emerge in our model under the assumption that
every beneficial mutation adds a constant amount $\varrho_N$ to the intraday individual reproduction rate,
which corresponds to the absence of epistasis in this part of the model. This shows that the observed
power law behaviour of the relative fitness can to some extent be explained by the mere design of
the experiment, based on a simple non-epistatic intraday model, a fact which may also be seen as a
strengthening of the argument of Wiser et al. [29] that a power law is an appropriate approximation
to the evolution of relative fitness.

In order to arrive at a power law (6) for more general $g$, we have to extend our model slightly.
Indeed, in Corollary 2.15 we prove that a gain in the reproduction rate of $x^{-q}\varrho_N$, for some $q > -1$, if
the present relative fitness is $x$, leads to a power law fitness curve with exponent $1/(2(1 + q))$, which
compares to (6) by taking $q = g - 1$.

For a recent study that proposes a general framework for quantifying patterns of macroscopic
epistasis from observed differences in adaptability, including a discussion of fitness and mutation
trajectories in the Lenski experiment, see [12]. We refer also to the discussion in [8] of various epistatic
models that would explain a declining adaptability in microbial evolution experiments, and to the
discussion in [22] concerning the evolutionary dynamics on epistatic versus non-epistatic fitness
landscapes with finitely many genotypes.


2. Models and main results 

2.1. Mathematical model of daily population cycles 

In this section, we construct a mathematical model for the daily reproduction and growth cycle of 
a bacterial population in the Lenski experiment, and state some first results, in particular on fixation 
probabilities of beneficial mutations. These are the foundations for our main results to be presented 
in Section 2.2.


2.1.1. Neutral model 

We start by introducing the neutral model, where all individuals in the population reproduce at the
same rate. The model consists of a continuous-time intraday dynamics and a discrete-time interday
dynamics; the latter is governed by a stopping rule and a sampling rule. We number the daily cycles, or
“days” as we call them for simplicity, by $i \in \mathbb{N}_0$. Fix $N \in \mathbb{N}$ and $r > 0$. We assume that every daily
cycle starts with exactly $N$ individuals that reproduce at rate $r$, the basic reproduction rate. More
precisely, we decree that, independently for every day $i \in \mathbb{N}_0$, the (neutral) intraday population size
process has the distribution of a Yule process, denoted by $(Z_t^{(N)})_{t \ge 0}$, with reproduction parameter $r$,
started with $Z_0^{(N)} = N$ individuals. Consequently, for every $t > 0$, the random variable $Z_t^{(N)}$ follows
a negative binomial distribution with parameters $N$ and $e^{-rt}$ (see Corollary A.4 in Appendix A). In
Appendix A we collect the properties of Yule processes that are relevant for this paper.
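The intraday dynamics are easy to simulate. The following Python sketch, with hypothetical names and parameter values, runs a Yule process event by event: while $n$ individuals are alive, the waiting time to the next binary split is exponential with rate $rn$. It then compares the empirical mean of the size at time $t$ with $N e^{rt}$, the mean of the negative binomial law mentioned above.

```python
import math
import random

def yule_size(n0, r, t, rng):
    """Simulate a Yule process started with n0 individuals, each splitting at rate r,
    and return the population size at time t."""
    size, clock = n0, 0.0
    while True:
        clock += rng.expovariate(r * size)   # waiting time to the next binary split
        if clock > t:
            return size
        size += 1

rng = random.Random(1)
n0, r, t = 20, 1.0, math.log(3.0)
runs = [yule_size(n0, r, t, rng) for _ in range(2000)]
print(sum(runs) / len(runs), n0 * math.exp(r * t))   # empirical mean vs. N * e^{rt}
```
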

Fix now $\gamma > 1$, and define stopping times

$$\varsigma_N := \inf\{t > 0 : Z_t^{(N)} \ge \gamma N\} \qquad (8)$$

and

$$\sigma^{(N)} := \inf\{t > 0 : \mathbb{E}[Z_t^{(N)}] \ge \gamma N\}. \qquad (9)$$

Note that $\varsigma_N$ is a random variable, while $\sigma^{(N)}$ is deterministic. In fact, since $\mathbb{E}[Z_t^{(N)}] = Ne^{rt}$, we see
immediately that $\sigma^{(N)}$ does not depend on $N$ and equals

$$\sigma = \frac{\log\gamma}{r}. \qquad (10)$$

Definition 2.1 (Neutral model). Fix $N \in \mathbb{N}$, $r > 0$, $\gamma > 1$. In the neutral model, independently for
every $i \in \mathbb{N}_0$, the population size at the end of day $i$ is given by a copy of the random variable $Z_\sigma^{(N)}$,
where $(Z_t^{(N)})_{t \ge 0}$ is defined above.


In other words, at every day the neutral population is started with $N$ individuals that reproduce
by binary splitting at rate $r$ (which leads to the above Yule process), with the population growth
stopped at time $\sigma$ that depends on $\gamma$ and $r$.


Remark 2.2 (Stopping rules). The two stopping times $\varsigma_N$ and $\sigma$ give rise to two different stopping rules
for the population: Stopping rule 1 stops the population growth at time $\varsigma_N$, that is, the time when the
population size has reached exactly $\lceil\gamma N\rceil$. On the other hand, stopping rule 2 uses $\sigma$ instead, which
implies that the size of the stopped population, given by $Z_\sigma^{(N)}$, has a negative binomial distribution
with parameters $N$ and $e^{-r\sigma} = 1/\gamma$. While $\varsigma_N$ might be a more natural choice for the stopping time of the
population growth, $\sigma$ is easier to deal with. In this paper we will work under stopping rule 2, but we
expect the essentials of our results to be true for $\varsigma_N$ as well. In fact, as we show in Lemma A.4
in Appendix A, $\varsigma_N$ converges to $\sigma$ in distribution.
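The closeness of the two stopping rules can be illustrated at the level of expectations. The Python sketch below uses the fact that a Yule process with $k$ individuals waits an exponential time with rate $rk$ until the next split, so the mean time to grow from $N$ to $\lceil\gamma N\rceil$ is a finite harmonic-type sum. It compares only means and is not a substitute for the convergence statement; the function names are hypothetical.

```python
import math

def expected_hitting_time(n, gamma, r):
    """Mean time for a Yule process started at n to first reach ceil(gamma * n).
    While k individuals are alive, the next split takes an Exp(r * k) time."""
    target = math.ceil(gamma * n)
    return sum(1.0 / (r * k) for k in range(n, target))

def sigma(gamma, r):
    """Deterministic stopping time log(gamma) / r of stopping rule 2."""
    return math.log(gamma) / r

# The mean of the random stopping time approaches sigma as n grows.
for n in (10, 100, 10_000):
    print(n, expected_hitting_time(n, 3.0, 1.0), sigma(3.0, 1.0))
```
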


2.1.2. The genealogy 

Before turning our attention to the model with selection, we briefly discuss the neutral genealogy. 
If we label the individuals within this process, we can keep track of their ancestral relationship by 
specifying a sampling rule. 

Definition 2.3 (Sampling rule). The parent population of day i + 1 is a uniform sample of size N 
taken from the population at the end of day i. 

Let $\nu^i = (\nu^i_1, \dots, \nu^i_N)$, $i = 0, 1, 2, \dots$, be a sequence of vectors such that $\nu^i_j$ is the number of offspring
in the population at the beginning of day $i$ of individual $j$ from the population at the beginning of
day $i - 1$. Since $(\nu^i)_{i \in \mathbb{N}_0}$ are independent and identically distributed, and for each $i$ the components
of $\nu^i$ are exchangeable and sum to $N$, we are facing a Cannings model, where the “days” play the role
of generations (see [28] for more background on Cannings models and coalescents). We can now fix a
generation i and consider the genealogy of a sample of n(< N ) individuals. Here, for conceptual and 
notational convenience, we shift the “present generation” to the time origin and extend the Cannings 
dynamics (which is time-homogeneous) to all the preceding generations as well. 

Definition 2.4 (Ancestral process). Sample $n$ individuals at generation 0 and denote them by $l_1, \dots, l_n$.
Let $\mathcal{P}_n$ be the set of partitions of $\{1, 2, \dots, n\}$ and $B^{(N,n)} = (B_g^{(N,n)})_{g \in \mathbb{N}_0}$ be the process taking values
in $\mathcal{P}_n$ such that $j$ and $k$ are in the same block of $B_g^{(N,n)}$ if and only if there is a common ancestor
at generation $-g$ for individuals $l_j, l_k$. Then $B^{(N,n)}$ is the ancestral process of the chosen sample.

It turns out that the genealogical process converges after a suitable time-scaling to the classical
Kingman coalescent (see [28] for a definition and more details on the relevance of Kingman’s coalescent
in population genetics). The time-rescaling depends on the population size $N$ and is determined by a
constant depending on $\gamma$.

Theorem 2.5 (Convergence to Kingman’s coalescent). For all $n \in \mathbb{N}$, the sequence of ancestral
processes $(B^{(N,n)}_{\lfloor Nt/(2(1 - \frac{1}{\gamma}))\rfloor})_{t \ge 0}$ converges weakly on the space of càdlàg paths as $N \to \infty$ to Kingman’s
$n$-coalescent.

The proof of Theorem 2.5 is given in Appendix A. Here we give a brief heuristic explanation of
the time change factor $2(1 - 1/\gamma)/N$. This factor is asymptotically equal to $c_{\gamma,N}$, the pair coalescence
probability in one generation, which in turn equals the probability that the second of two sampled
individuals belongs to the same (one generation) offspring as the first one. Hence, in the limit $N \to \infty$,
$c_{\gamma,N}$ is asymptotically equal to the ratio $(\mathbb{E}\hat G - 1)/(N\,\mathbb{E}G)$, where $G$ is the one-generation offspring
number of a single individual, and $\hat G$ is a size-biased version of $G$. If $G$ has a geometric distribution
with expectation $\gamma$ (which is the case in our setting, as can be seen from Lemma A.3 in the
Appendix), then $\mathbb{E}\hat G = \mathbb{E}G^2/\mathbb{E}G = 2\gamma - 1$, and hence $c_{\gamma,N} \sim 2(1 - \frac{1}{\gamma})/N$. (In particular, for large $\gamma$,
$G/\gamma$ is asymptotically exponential, $\mathbb{E}\hat G \sim 2\,\mathbb{E}G$, and $c_{\gamma,N} \sim \frac{2}{N}$.)
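The moment computation in this heuristic can be verified numerically. The Python sketch below sums the series for the first two moments of a geometric law on $\{1, 2, \dots\}$ with mean $\gamma$, forms the size-biased mean $\mathbb{E}G^2/\mathbb{E}G$, and compares the resulting ratio with $2(1 - 1/\gamma)/N$. The function names, the truncation level and the parameter values are choices made for this illustration.

```python
def geometric_moments(gamma, terms=5000):
    """First and second moment of a geometric law on {1, 2, ...} with mean gamma,
    obtained by summing the (truncated) series."""
    p = 1.0 / gamma
    first = second = 0.0
    weight = p                      # P(G = 1)
    for k in range(1, terms + 1):
        first += k * weight
        second += k * k * weight
        weight *= 1.0 - p           # P(G = k + 1) = P(G = k) * (1 - p)
    return first, second

def pair_coalescence(gamma, n):
    """Heuristic pair coalescence probability (E[size-biased G] - 1) / (n * E[G])."""
    first, second = geometric_moments(gamma)
    size_biased_mean = second / first          # E[G^2] / E[G], equal to 2*gamma - 1
    return (size_biased_mean - 1.0) / (n * first)

gamma, n = 3.0, 1000
print(pair_coalescence(gamma, n), 2.0 * (1.0 - 1.0 / gamma) / n)
```
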


2.1.3. Including selective advantage 

We now drop the assumption that the relative fitness is constant over the whole population, and
include some selective advantage. Fix $r > 0$, $\gamma > 1$ as before. For $N \in \mathbb{N}$ let $\varrho_N > 0$. Throughout this
paper, we will assume that the sequence $(\varrho_N)_{N \in \mathbb{N}}$ satisfies the condition

$$\exists\, b \in (0, 1/2) : \varrho_N \sim N^{-b} \quad \text{as } N \to \infty. \qquad (11)$$


We extend our basic population model in the following way. Assume that at day $i$ a number $k$
among the $N$ individuals of the initial population have a selective advantage in the sense that they
reproduce at rate $r + \varrho_N$, and the remaining $N - k$ individuals reproduce at rate $r$. We call the
selectively advantageous individuals the mutants, and the others the wild-type individuals. We assume
that fitness is heritable, meaning that offspring (unless affected by a mutation) retain the fitness of
their parent. The intraday population size process at day $i$ is then of the form

$$Y_t := Y_t^{(N,k)} = M_t^{(k)} + Z_t^{(N-k)}, \qquad t \ge 0, \qquad (12)$$

where $(Z_t^{(N-k)})_{t \ge 0}$ is a Yule process with reproduction rate $r$, started with $Z_0^{(N-k)} = N - k$ individuals,
while $(M_t^{(k)})_{t \ge 0}$ is a Yule process with reproduction rate $r + \varrho_N$, started with $M_0^{(k)} = k$ individuals,
and independent of $(Z_t^{(N-k)})_{t \ge 0}$. Note that for fixed $r$ and $\varrho_N$ the distribution of $(Y_t)_{t \ge 0}$ is uniquely
determined by the initial number $M_0^{(k)} = k$ of mutants.

We apply stopping rule 2 to this model, which translates into stopping population growth at a
deterministic time depending on $k$ (and $N$), namely at

$$\sigma_k := \sigma_k^{(N)} = \inf\{t > 0 : \mathbb{E}[Y_t] \ge \gamma N\}. \qquad (13)$$

This is still a deterministic time, though somewhat harder to calculate than $\sigma$, which equals $\sigma_0$ in this
notation. Due to our construction, at the end of day $i$ the total population has size $Y_{\sigma_k}$, among which
there are $M_{\sigma_k}^{(k)}$ mutants and $Z_{\sigma_k}^{(N-k)}$ wild-type individuals.


One of the main tasks of this paper will be to calculate the number of mutants at the beginning of
day $i$, for $i \in \mathbb{N}_0$. Assuming that we know the population $Y_{\sigma_k} = M_{\sigma_k}^{(k)} + Z_{\sigma_k}^{(N-k)}$ at the end of day $i - 1$,
we apply Definition 2.3, which means that given $M_{\sigma_k}^{(k)} = M$ and $Z_{\sigma_k}^{(N-k)} = Z$, we sample uniformly
$N$ out of the $M + Z$ individuals. Denote by $K_i$ the number of mutants contained in this sample.
Fixing $K_0$ and repeating this independently for $i \in \mathbb{N}$ defines the interday process $(K_i)_{i \in \mathbb{N}_0}$ counting
the number of mutants in the model with selection at the beginning of each day. Summarizing, this
process can be described as follows:


Proposition 2.6 (Model with selection). Fix $\gamma > 1$, $r > 0$ and $\varrho_N$, $N \in \mathbb{N}$ satisfying (11). Fix $K_0 \in 
\{1, \dots, N\}$. Assume $K_{i-1}$ has been constructed, and takes the value $k$. Let $M$ follow a negative binomial 
distribution with parameters $k$ and $e^{-(r+\varrho_N)\sigma_k}$, and let $Z$ follow a negative binomial distribution with 
parameters $N - k$ and $e^{-r\sigma_k}$, independent of $M$. Conditional on $M$ and $Z$, the number $K_i$ is determined 
by sampling from the hypergeometric distribution with parameters $N$, $M$ and $M + Z$. 


Proof. This follows from the construction, noting that $(M_t)_{t \ge 0}$ and $(Z_t)_{t \ge 0}$ evolve independently until 
the deterministic time $\sigma_k$, and recalling that sampling $N$ individuals without replacement out of $M$ 
of one type and $Z$ of another type is described by the hypergeometric distribution. □ 

Remark 2.7 (More than two types). The definition of the model with selection generalizes in an obvious 
way to situations where there are more than two different types of individuals in the population. If 
there are $\ell$ different types reproducing at $\ell$ different (fixed) rates, the population within one day grows 
like $\ell$ independent Yule processes with suitable initial values and reproduction rates, the stopping time 
is defined accordingly, and the sampling remains uniform over the whole population. 
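
The one-day transition of Proposition 2.6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `stopping_time`, `yule_size` and `next_day` are hypothetical, the negative binomial variables are generated as sums of geometric variables, and the hypergeometric draw is carried out one individual at a time.

```python
import math
import random

def stopping_time(k, N, r, rho, gamma, iterations=200):
    """Bisection for sigma_k: k*e^{(r+rho)t} + (N-k)*e^{rt} = gamma*N."""
    lo, hi = 0.0, math.log(gamma) / r
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        size = k * math.exp((r + rho) * mid) + (N - k) * math.exp(r * mid)
        if size < gamma * N:
            lo = mid
        else:
            hi = mid
    return hi

def yule_size(n0, rate, t, rng):
    """Size at time t of a Yule process with n0 founders.

    Each founder's family size is geometric on {1, 2, ...} with success
    probability exp(-rate*t); the sum is negative binomial.
    """
    p = math.exp(-rate * t)
    total = 0
    for _ in range(n0):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        total += 1 + int(math.log(u) / math.log(1.0 - p))
    return total

def next_day(k, N, r, rho, gamma, rng):
    """Number of mutants at the start of the next day, given k mutants today."""
    sigma = stopping_time(k, N, r, rho, gamma)
    m = yule_size(k, r + rho, sigma, rng)   # mutants at the end of the day
    z = yule_size(N - k, r, sigma, rng)     # wild-type at the end of the day
    mutants_left, left, sampled = m, m + z, 0
    for _ in range(N):                      # draw N without replacement
        if rng.random() < mutants_left / left:
            sampled += 1
            mutants_left -= 1
        left -= 1
    return sampled
```

Because every family has size at least one, the end-of-day population always contains at least $N$ individuals, so the sampling step is well defined. Iterating `next_day` from $K_0 = 1$ gives one trajectory of the interday process.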

Since the mutants reproduce faster, their proportion will increase (stochastically) during the day. 
Hence, sampling uniformly at random from the population at the end of day i we expect to sample 
more than the initial number of mutants, meaning that the fitness of the population will increase over 
time. 

Proposition 2.8 (Selective advantage). Under assumption (11), 

$$E[K_1 \mid K_0 = 1] - 1 \sim \varrho_N \frac{\log\gamma}{r} \quad \text{as } N \to \infty. \tag{14}$$

Under the condition $\{K_0 = 1\}$ the $N - K_1$ wild-type individuals that are sampled at the end 
of day 0 are exchangeably distributed upon the $N - 1$ wild-type ancestors that were present at the 
beginning of day 0. Hence, the expected (sampled) offspring of each of these wild-type ancestors is $\sim 1$ 
as $N \to \infty$, and thus, in view of Proposition 2.8, we can say that the selective advantage of a single 
mutant, resulting from the increase of its reproduction rate from $r$ to $r + \varrho_N$, is given by $\varrho_N \frac{\log\gamma}{r}$. 

The main result of this section concerns the fixation probability of a beneficial mutation affecting 
one individual at the beginning of day 0 , and an estimate of the time that it takes for a successful 
mutation to go to fixation (or for an unsuccessful mutation to go extinct). Let 


$$\pi_N := P(\exists i \in \mathbb{N} : K_i = N \mid K_0 = 1) \tag{15}$$

denote the probability of fixation if the population size process is started with one mutant at day 0 
and write 

$$\tau^N_{\mathrm{fix}} := \inf\{i \ge 1 : K_i = N\} \in [0, \infty] \tag{16}$$

for the time of fixation, and 

$$\tau^N_{\mathrm{ext}} := \inf\{i \ge 1 : K_i = 0\} \in [0, \infty] \tag{17}$$

for the time until the mutation has been lost from the population, with the usual convention that 
$\inf \emptyset = \infty$. Let 

$$\tau^N := \tau^N_{\mathrm{fix}} \wedge \tau^N_{\mathrm{ext}}$$

be the first day at which either the whole population carries the mutation, or there are no more 
individuals in the population carrying the mutation. Let 

$$C(\gamma) := \frac{\gamma \log\gamma}{\gamma - 1}. \tag{18}$$


Theorem 2.9 (Probability and speed of fixation). Assume (11), and assume that a mutation affects 
exactly one individual at day 0, and that no further mutations happen after the first one. Then as 
$N \to \infty$, 

$$\pi_N \sim \varrho_N \frac{C(\gamma)}{r}. \tag{19}$$

Moreover, for any $\delta > 0$ there exists $N_\delta \in \mathbb{N}$ such that for all $N \ge N_\delta$ 

$$P(\tau^N > \varrho_N^{-1-3\delta}) \le (7/8)^{\varrho_N^{-\delta}}. \tag{20}$$

The proof, which will be given in Section 3, relies on a comparison with a supercritical (near- 
critical) Galton-Watson process in the “early phase of the sweep”. While the basic idea is classical 
(dating back to work of Fisher from the 1920s), the scaling (11) of the supercriticality and the specific 
nature of our Cannings dynamics required new arguments and a delicate analysis. For related results 
on near-critical Galton-Watson processes (which in some parts inspired our reasoning) see the recent 
work of Parsons [24]. 

2.2. Genetic and adaptive evolution 

Our ultimate goal is to understand the deceleration of the increase in the relative fitness observed 
in [25], in particular as compared to the linearly increasing number of successful mutations (“adaptive 
versus genetic evolution”). In our model the relevant scales for the two processes turn out to be 
different, since the assumptions are such that many successful mutations are needed in order to have 
a change of approximately one unit in the relative fitness. 

This section is divided into two parts. First, we study the model on a short time scale, which 
is the relevant one for the arrivals of successful mutations. We prove that under some assumptions 
on the model parameters the number of successful mutations converges on a suitable time scale to a 
standard Poisson process. Afterwards, we introduce the process of relative fitness of the population, 
and we show that this process converges on a longer time scale to a deterministic function. 


2.2.1. Genetic and adaptive evolution on a short scale 

The assertion of Theorem 2.9 can be rephrased as follows: In a background of wild-type individuals 
that reproduce at rate $r$, a beneficial mutation that leads to a reproduction rate $r + \varrho_N$ has a probability 
of fixation obeying (19). Besides recalling condition (11) on the selection, in the following assumption 
we require that the mutation rate is small enough to exclude “effective clonal interference” between 
beneficial mutations. 


Assumption A (Additive, moderately strong selection-weak mutation). Beneficial mutations occur 
and act in such a way that the following hold: 

i) Beneficial mutations add $\varrho_N$ to the reproduction rate of the individual that suffers the mutation. 

ii) In each generation, with probability $\mu_N$ there occurs a beneficial mutation. The mutation affects 
only one (uniformly chosen) individual, and every offspring of this individual also carries the 
mutation. 

iii) There exists $0 < b < 1/2$, and $a > 3b$, such that $\mu_N \sim N^{-a}$ and $\varrho_N \sim N^{-b}$ as $N \to \infty$. 


We use the term moderately strong selection in order to indicate that the strength of selection in 
our model is between what is generally called strong selection, where $\varrho_N = O(1)$, and weak selection, 
where $\varrho_N = O(N^{-1})$ as $N \to \infty$. Models with such types of selection were recently considered in 
the context of density dependent birth-death-mutation processes by Parsons [24, 25]. The term weak 
mutation is used to indicate that the mutation rate is small enough to guarantee the absence of clonal 
interference as $N \to \infty$, which we will prove in Proposition 2.12. 

Definition 2.10 (Interfering mutations, clonal interference). Consider a pair of successive mutations. 
Recall that $\tau^N$ denotes the first time after the first mutation at which the individual reproduction rate 
is constant within the population. Denote by $m_N$ the time of the second mutation. We say that the two 
mutations interfere if $m_N < \tau^N$, and that clonal interference occurs if there exists a pair of interfering 
mutations. In particular, there is no clonal interference until day $i$ if there is no mutation starting 
until day $i$ that interferes with any other mutation. 


Remark 2.11. (i) As we will see in Proposition 2.12 below, Assumption A iii) guarantees that the 
probability of clonal interference of any pair of successive mutations is of order at most $\mu_N \varrho_N^{-1-\delta}$. In 
particular, this ensures that the probability of not observing any event of clonal interference on a time 
scale of order $\mu_N^{-1}\varrho_N^{-2}$ (which we will see to be relevant for our model) tends to 1 as $N \to \infty$. 

(ii) Our assumption A iii) is somewhat stronger than requiring $\mu_N \ll \varrho_N$, which is a standard 
assumption in adaptive dynamics excluding clonal interference, see e.g. [B]. In view of Theorem 2.9 and 
of our detailed calculations in Section 3 we think that replacing $a > 3b$ by $a > b$ in Assumption A iii) 
should still lead to the same results. However, there are substantial technical difficulties to consider 
in this case, since $a > b$ only excludes clonal interference of two successive mutations, but not on the 
longer time scales that are relevant for our results. 

(iii) While there is little doubt that there is clonal interference (of successive beneficial mutations) in 
the Lenski experiment [21], it is notable that, as will be seen in Theorem 2.14, in order to qualitatively 
explain certain features of the experimental results on the relative fitness of the population, it 
is not mandatory to include clonal interference as a model assumption. Including clonal interference 
into the model will be one goal of our future research in this topic. 

Proposition 2.12 (Probability of clonal interference). In our model, for any $\delta > 0$ there exists 
$N_\delta \in \mathbb{N}$ such that for all $N \ge N_\delta$, 

$$P(m_N < \tau^N) \le \mu_N \varrho_N^{-1-\delta}.$$

In particular, under Assumption A iii), for any $T > 0$, 

$$\lim_{N \to \infty} P\big(\text{no clonal interference until day } \lfloor \varrho_N^{-2}\mu_N^{-1} T \rfloor\big) = 1. \tag{21}$$



A quantity of interest is the number of successful mutations up to a given day. Let $H_i$ denote the 
number of eventually successful mutations that have started until day $i$, with $H_0 = 0$. Since mutations 
arrive independently at rate $\mu_N$, and fixate with probability $\sim \frac{C(\gamma)}{r}\varrho_N$ (at least in the absence of 
clonal interference), we expect that successful mutations arrive at rate $\frac{C(\gamma)}{r}\mu_N\varrho_N$. Indeed, Proposition 
2.12 allows us to make this rigorous. 

Theorem 2.13 (Process of successful mutations). Let $H_i$, $i \in \mathbb{N}$, be the number of successful mutations 
initiated until day $i$, with $H_0 = 0$. Let $r_0 > 0$ be the reproduction rate of the population at day 0, and 
let $(M(t))_{t \ge 0}$ be a standard Poisson process. Under Assumption A, for any $T > 0$, the process 
$(H_{\lfloor t/(\varrho_N \mu_N) \rfloor})_{0 \le t \le T}$ converges in distribution (with respect to the Skorokhod topology on the space of 
càdlàg paths) to $\big(M(\tfrac{C(\gamma)}{r_0} t)\big)_{0 \le t \le T}$. 



Figure 3: The fitness process $F_i$ (solid black line), started at fitness $x$, depicted until the time of fixation of the 
next successful mutation, in the absence of clonal interference. The light grey line represents the approximation 
$\Phi_i$ defined in (22). 


2.2.2. Genetic and adaptive evolution on a long time scale 

Our next goal is to investigate the process describing the fitness of the evolved population relative 
to the ancestral population at day 0. Let $R_{i,j}$, for $i \in \mathbb{N}_0$ and $1 \le j \le N$, denote the reproduction rate 
of individual $j$ at the beginning of day $i$. Assume that at day 0 every individual has reproduction rate 
$r_0$, that is, $R_{0,j} = r_0$ for all $j = 1, \dots, N$. Recall from (3) the definition of the relative fitness $F_i$ at day 
$i$ with respect to day 0. We can connect the relative fitness with the number of successful mutations 
in the following way. Let $\underline{R}_i := \min_{1 \le j \le N} R_{i,j}$ and $\overline{R}_i := \max_{1 \le j \le N} R_{i,j}$ denote the minimal and 
maximal reproduction rate at day $i$, respectively. Then we have 

$$\frac{\underline{R}_i}{r_0} \le F_i \le \frac{\overline{R}_i}{r_0}, \quad i \in \mathbb{N}_0.$$

Moreover, on the event that there is no clonal interference up to day $i$, one has 

$$r_0 + \varrho_N (H_i - 1) \le \underline{R}_i \le \overline{R}_i \le r_0 + \varrho_N H_i.$$

Let 

$$\Phi_i := 1 + \frac{\varrho_N}{r_0} H_i. \tag{22}$$

Thus on the event that there is no clonal interference we have 

$$\Phi_i - \frac{\varrho_N}{r_0} \le F_i \le \Phi_i. \tag{23}$$

From Theorem 2.13 we see that the relevant time scale for the successful mutations is given by 
$\varrho_N^{-1}\mu_N^{-1}$. Since the selective advantage of a single mutation is of order $\varrho_N$ (cf. Proposition 2.8), in 
view of (22) it seems plausible that the time scale on which to expect a non-trivial limit of the fitness 
process is $\varrho_N^{-2}\mu_N^{-1}$. This suggests that the relative fitness has to be considered on a time scale different 
from that of the number of successful mutations. 

Indeed our next theorem shows that the process $F^N := (F_{\lfloor (\varrho_N^2 \mu_N)^{-1} t \rfloor})_{t \ge 0}$ has a non-trivial scaling 
limit, which turns out to be a deterministic parabola. 


Theorem 2.14 (Convergence of the relative fitness process). Assume $R_{0,j} = r_0$ for $j = 1, \dots, N$, and 
let $(F_i)_{i \in \mathbb{N}_0}$ be the process of relative fitness. Then under Assumption A, the sequence of processes 
$(F_{\lfloor (\varrho_N^2 \mu_N)^{-1} t \rfloor})_{t \ge 0}$ converges in distribution as $N \to \infty$ locally uniformly to the deterministic function 

$$f(t) = \sqrt{1 + \frac{2 C(\gamma) t}{r_0^2}}, \quad t \ge 0.$$

The proof of this theorem will be given in Section 3.11. It relies on the fact that due to Proposition 
2.12 the relative fitness process $(F_i)_{i \in \mathbb{N}_0}$ can be approximated by the process $(\Phi_i)_{i \in \mathbb{N}_0}$ defined in (22). 


A similar result can be obtained if a beneficial mutation provides an advantage that depends on 
the current fitness level. For example, let us assume that a mutation that goes to fixation when the 
relative fitness is $x$ provides an increment to the reproduction rate that is of the form 

$$\varrho_N^{\psi}(x) = \psi(x)\,\varrho_N \tag{24}$$

for some continuous function $\psi : [1, \infty) \to \mathbb{R}_+$. As we will see in the next corollary, a special choice 
of $\psi$ as a power function leads to a fitness curve of power-law form. 

Corollary 2.15. Under Assumption A and (24), let $F_i^{\psi}$ be the relative fitness of the population at 
day $i$ with respect to the ancestral population at time 0. Then the process $(F^{\psi}_{\lfloor (\varrho_N^2 \mu_N)^{-1} t \rfloor})_{t \ge 0}$ converges 
in distribution and locally uniformly as $N \to \infty$ to the deterministic function $h$ which is the solution 
of the differential equation 

$$\dot{h}(t) = \frac{\psi(h(t))^2\, C(\gamma)}{r_0^2\, h(t)}, \quad h(0) = 1, \quad t \ge 0.$$

In particular, if $\psi(x) = x^{-q}$ for some $q > -1$, then 

$$h(t) = \left(1 + \frac{2(1+q)\,C(\gamma)}{r_0^2}\, t\right)^{\frac{1}{2(1+q)}}, \quad t \ge 0. \tag{25}$$

This is similar to the family of curves found in [29], see also the discussion in Section 1.5. 
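
As a numerical sanity check of the power-law curve, the following Python sketch integrates the differential equation of Corollary 2.15 with a classical Runge-Kutta scheme and compares the result with a closed form. It assumes the reading $\psi(x) = x^{-q}$ with $q > -1$, under which the curve $h(t) = (1 + 2(1+q)C(\gamma)t/r_0^2)^{1/(2(1+q))}$ solves the equation; the function names and parameter values are illustrative only.

```python
import math

def C(gamma):
    """C(gamma) = gamma * log(gamma) / (gamma - 1), as in (18)."""
    return gamma * math.log(gamma) / (gamma - 1.0)

def h_closed(t, q, gamma, r0):
    """Candidate closed-form solution for psi(x) = x**(-q), q > -1."""
    base = 1.0 + 2.0 * (1.0 + q) * C(gamma) * t / r0**2
    return base ** (1.0 / (2.0 * (1.0 + q)))

def h_numeric(t_end, q, gamma, r0, steps=20000):
    """Integrate h' = psi(h)^2 C(gamma) / (r0^2 h), h(0) = 1, by RK4."""
    def rhs(h):
        return h ** (-2.0 * q) * C(gamma) / (r0**2 * h)

    h, dt = 1.0, t_end / steps
    for _ in range(steps):
        k1 = rhs(h)
        k2 = rhs(h + 0.5 * dt * k1)
        k3 = rhs(h + 0.5 * dt * k2)
        k4 = rhs(h + dt * k3)
        h += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return h
```

With $q = 0$ the closed form reduces to the square-root curve of Theorem 2.14, and with $q = -1/2$ the right-hand side of the equation is constant, so $h$ grows linearly.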
3. Proof of the main results 


In this section, we provide the proofs of the results that we stated in Section 2, in particular 
Theorem 2.9, which is technically the most involved and requires several preparatory steps, which are 
carried out first. After these preparations, the proof of Theorem 2.9 will be carried out in Section 3.8. 
The proofs of the other main results will be given in Sections 3.9 through 3.11. 

It turns out that if the number of mutants reaches at least $\varepsilon N$, for some $\varepsilon \in (0,1)$, then the 
mutation will fixate with probability tending to one as $N \to \infty$. Our strategy for proving Theorem 
2.9 is thus to divide the time between the occurrence of a mutation and its eventual fixation into three 
stages. For the case of a successful mutation this is depicted in Figure 4. 



Figure 4: A sketch of the frequency of mutants during a selective sweep going to fixation. One distinguishes 
3 parts: when the number of mutants is at most $\varepsilon N$ (phase 1), when the number of wild type individuals is 
at most $\varepsilon N$ (phase 3) and the intermediate stage (phase 2). This subdivision of a selective sweep into three 
phases is a now classical approach, see for example [141 IjJ. :5]. 


The first stage starts at the day of the mutation, and ends at the first day $i \in \mathbb{N}$ that the number 
$K_i$ of mutants has reached a level $\varepsilon N$, for some $\varepsilon \in (0, 1/2)$. The second stage starts upon reaching $\varepsilon N$, 
and ends when the process $(K_i)_{i \in \mathbb{N}_0}$ reaches $(1 - \varepsilon)N$. The last stage is between $(1 - \varepsilon)N$ and $N$. We 
will use different methods to analyze the behaviour of the process during these three stages. The first 
stage is the most difficult to deal with, and we use a coupling to suitable Galton-Watson processes to 
show that the probability that $(K_i)_{i \in \mathbb{N}_0}$ with $K_0 = 1$ ever reaches $\varepsilon N$ is approximated by (19). The 
second stage can be treated by a simple ODE approximation, from which one sees that if $K_i \ge \varepsilon N$ at 
some time $i$, then with probability tending to 1 (as $N \to \infty$) the process will eventually reach level 
$(1 - \varepsilon)N$. The third stage will be dealt with in a manner that has some similarities to the first stage, 
observing that starting from at least $(1 - \varepsilon)N$ mutants, there is always a positive probability to reach 
fixation in the next step. Moreover, our methods of proof will also show that with high probability 
each of these stages will not last longer than $\varrho_N^{-1-\delta}$, for any $\delta > 0$. 

To be more specific, fix $0 < \varepsilon < 1/2$. Assume $K_0 = 1$. Let 

$$T_1^N := \inf\{i : K_i \ge \varepsilon N\},$$

and 

$$T_2^N := \inf\{i : K_i \ge (1 - \varepsilon)N\}. \tag{26}$$

Then we can write $\tau^N_{\mathrm{fix}}$ as the sum 

$$\tau^N_{\mathrm{fix}} = T_1^N + (T_2^N - T_1^N) + (\tau^N_{\mathrm{fix}} - T_2^N).$$

The important intermediate steps of the proof, dealing with $T_1^N$, $(T_2^N - T_1^N)$, and $(\tau^N_{\mathrm{fix}} - T_2^N)$, 
respectively, are given below in Sections 3.5, 3.6 and 3.7, after some preparatory steps in Sections 3.1 
through 3.4. The proof of Theorem 2.9 is completed in Section 3.8. 


Assumption and notation. Throughout all of Section 3 we fix $r > 0$, $\gamma > 1$, and work under the 
assumption (11), fixing $b \in (0, 1/2)$ accordingly. We a priori assume $\varepsilon \in (0, 1/2)$, but note that in 
some places we will impose further conditions. Unless stated otherwise, $P_k$, $E_k$, and $\mathrm{var}_k$, $k \in \mathbb{N}$, refer 
to the law, expectation and variance of $(K_i)_{i \in \mathbb{N}_0}$ started at $K_0 = k$, or any random variables defined 
on the same probability space. We use $c, c', \tilde{c}, \dots$ to denote generic constants which are independent 
of $N$, with possibly different values at different occurrences. 


3.1. A simplified sampling and construction of the auxiliary Galton-Watson processes 

The construction of our model (as explained in Section 2.1.3) was such that $K_{i+1}$ was obtained 
from $K_i$ by letting two independent Yule populations with initial sizes $K_i$ and $N - K_i$ and respective 
growth rates $r + \varrho_N$ and $r$ evolve until time $\sigma_{K_i}$ (defined in (13)) and then sampling uniformly $N$ 
individuals from the total of those two populations, which amounts to a mixed hypergeometric 
sampling of the number of individuals (Proposition 2.6). In order to simplify the picture, we would like 
to use binomial rather than hypergeometric sampling, i.e. sampling individuals independently of each 
other with equal probability. In this way we will manage to construct two Galton-Watson processes 
$(\underline{K}_i)_{i \in \mathbb{N}_0}$ and $(\overline{K}_i)_{i \in \mathbb{N}_0}$ that will serve as lower and upper bounds, respectively, for our true process in the 
first stage of the sweep. We prepare this construction by first giving an alternative description for the 
sampling of mutants. 


Consider the population at the end of a given day (day 0, say). Assume $K_0 = k$, hence by 
construction at the end of day 0 there are $M_{\sigma_k}$ mutant individuals for which we want to determine 
whether or not they will be sampled for the next day. (Recall the definition of $M_t$ from (12).) Label 
these mutant individuals with numbers $1, \dots, M_{\sigma_k}$. Define 

$$X_j := 1_{\{\text{individual } j \text{ is selected}\}}, \quad j = 1, \dots, M_{\sigma_k}.$$

Define a random variable 

$$\Gamma := \frac{Y_{\sigma_k}}{N}. \tag{27}$$

Thus $\Gamma$ is the ratio between the number of individuals at the end of day 0 and the number of individuals 
at the beginning of day 1, and by the definition (13) of $\sigma_k$, $E[\Gamma] = \gamma$. Moreover, $\Gamma \ge 1$, and $P(\Gamma > 1)$ is exponentially 
close to 1 as $N \to \infty$. Conditional on $\Gamma$, for every $j = 1, \dots, M_{\sigma_k}$, 

$$P(X_j = 1) = \frac{1}{\Gamma},$$

but due to our sampling mechanism, the $X_j$, $j = 1, \dots, M_{\sigma_k}$, are not independent. Their joint law 
conditional on $\Gamma$ and $M_{\sigma_k}$ can be described as follows. Let $(U_j)_{j \in \mathbb{N}}$ be i.i.d. uniform random variables 
on $[0,1]$. Let $\tilde{X}_1 := 1_{\{U_1 \le 1/\Gamma\}}$, and define recursively for $j \ge 2$ 

$$\tilde{X}_j := 1_{\left\{U_j \le \frac{N - \sum_{l=1}^{j-1} \tilde{X}_l}{\Gamma N - (j-1)}\right\}}. \tag{28}$$


For later convenience we define $U_j$ and $\tilde{X}_j$ for $j \in \mathbb{N}$, even though $X_j$ is defined only for $j = 1, \dots, M_{\sigma_k}$. 

Lemma 3.1. Conditional on $\Gamma$, $(X_j)_{j = 1, \dots, M_{\sigma_k}}$ is equal in distribution to $(\tilde{X}_j)_{j = 1, \dots, M_{\sigma_k}}$. 

Proof. Conditional on $\Gamma$, we can represent the sampling procedure as follows: Individual 1 has 
probability $1/\Gamma$ of being selected. For individual 2, the probability of being sampled depends on whether 
or not individual 1 was selected, in fact 

$$P(X_2 = 1) = \frac{N - 1}{\Gamma N - 1} P(X_1 = 1) + \frac{N}{\Gamma N - 1} P(X_1 = 0), \tag{29}$$

or equivalently 

$$P(X_2 = 1 \mid X_1) = \frac{N - X_1}{\Gamma N - 1}. \tag{30}$$

Proceeding thus recursively, we find that the probability that the $j$th individual is selected, conditional 
on knowing $X_1, \dots, X_{j-1}$, is 

$$P(X_j = 1 \mid X_1, \dots, X_{j-1}) = \frac{N - \sum_{l=1}^{j-1} X_l}{\Gamma N - (j-1)} = P(\tilde{X}_j = 1 \mid \tilde{X}_1 = X_1, \dots, \tilde{X}_{j-1} = X_{j-1}). \tag{31}$$

This completes the proof. □ 
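
The sequential description of uniform sampling used in Lemma 3.1 can be illustrated by a short Python sketch. Here `total` plays the role of $\Gamma N$ (the population size at the end of the day) and `n_select` the role of $N$; both names, and the function name, are hypothetical.

```python
import random

def sequential_selection(total, n_select, rng):
    """Select n_select out of total individuals one at a time.

    Individual j (1-based) is selected iff
    U_j < (n_select - number selected so far) / (total - (j - 1)),
    which mirrors the recursion (28).
    """
    flags, selected = [], 0
    for j in range(1, total + 1):
        threshold = (n_select - selected) / (total - (j - 1))
        x = 1 if rng.random() < threshold else 0
        selected += x
        flags.append(x)
    return flags
```

By construction exactly `n_select` flags equal 1, and each individual is selected with marginal probability `n_select / total`, in line with $P(X_j = 1) = 1/\Gamma$.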

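We can now construct the auxiliary Galton-Watson processes. Fix $a > 0$. We are going to specify 
a joint transition mechanism for $(K_i)_{i \in \mathbb{N}_0}$ and the auxiliary processes $(\underline{K}_i)_{i \in \mathbb{N}_0}$ and $(\overline{K}_i)_{i \in \mathbb{N}_0}$. To this 
purpose, let $\underline{k}, k, \overline{k}$ be natural numbers. Grow independent Yule trees at rate $r + \varrho_N$ up to time $\sigma_0$, and 
number these trees by $l = 1, 2, \dots$. Number all the individuals in this forest at time $\sigma_0$ by $j = 1, 2, \dots$ 
and denote the $j$-th individual by $I_j$. Let $(U_j)_{j \in \mathbb{N}}$ be a sequence of independent random variables, 
uniformly distributed on $[0,1]$ and independent of the Yule processes. For $j \in \mathbb{N}$ define 

$$\overline{X}_j := 1_{\{U_j \le 1/\gamma + N^{-a}\}}$$

and 

$$\underline{X}_j := 1_{\{U_j \le 1/\gamma - N^{-a}\}}.$$

Also, define $\Gamma$ as in (27), and $\tilde{X}_j$ by (28). We put 

$$\begin{aligned}
\underline{L} &:= |\{j : I_j \text{ belongs to the first } \underline{k} \text{ trees and is born before time } \sigma_{\lceil \varepsilon N \rceil}, \text{ and } \underline{X}_j = 1\}|, \\
L &:= |\{j : I_j \text{ belongs to the first } k \text{ trees and is born before time } \sigma_k, \text{ and } \tilde{X}_j = 1\}|, \\
\overline{L} &:= |\{j : I_j \text{ belongs to the first } \overline{k} \text{ trees and is born before time } \sigma_0, \text{ and } \overline{X}_j = 1\}|.
\end{aligned} \tag{32}$$

Let 

$$J := \inf\left\{ j \ge 1 : \frac{N - \sum_{l=1}^{j-1} \tilde{X}_l}{\Gamma N - (j-1)} \notin \left[\frac{1}{\gamma} - N^{-a}, \frac{1}{\gamma} + N^{-a}\right] \right\}. \tag{33}$$

By construction it is clear that for every $j < J$ 

$$\sum_{l=1}^{j} \underline{X}_l \le \sum_{l=1}^{j} \tilde{X}_l \le \sum_{l=1}^{j} \overline{X}_l. \tag{34}$$

Thus if $\underline{k} \le k \le \overline{k}$, on the event $\{J > M_{\sigma_k}\}$ we have 

$$\underline{L} \le L \le \overline{L}.$$

Definition 3.2. Let $(\underline{K}_i, K_i, \overline{K}_i)_{i \in \mathbb{N}_0}$ be a Markov chain whose transition probability from $(\underline{k}, k, \overline{k})$ is 
the joint distribution of $(\underline{L}, L, \overline{L})$ given by (32). 

By construction, the coordinate processes $(\underline{K}_i)_{i \in \mathbb{N}_0}$, $(K_i)_{i \in \mathbb{N}_0}$ and $(\overline{K}_i)_{i \in \mathbb{N}_0}$ are also Markov chains. 
We note in particular that because of Lemma 3.1 the dynamics of $(K_i)_{i \in \mathbb{N}_0}$ is the same as that 
described in Proposition 2.6. 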


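We will show in the next section that if $a \in (b, 1/2)$, then $P(J > M_{\sigma_k})$ is exponentially close to 
one for any $k \le \varepsilon N$. From this we will deduce that with high probability, for $\underline{K}_0 = K_0 = \overline{K}_0 = 1$, we 
have 

$$\underline{K}_i \le K_i \le \overline{K}_i \quad \forall i \le T_1^N. \tag{35}$$

Note that by definition 

$$\underline{K}_i \le \overline{K}_i, \quad \forall i \in \mathbb{N}_0, \tag{36}$$

always holds. The following characterization of $(\overline{K}_i)_{i \in \mathbb{N}_0}$ and $(\underline{K}_i)_{i \in \mathbb{N}_0}$ is immediate from the 
construction: 


Proposition 3.3. Let $a > 0$ as before, and $(\underline{K}_i, K_i, \overline{K}_i)_{i \in \mathbb{N}_0}$ as in Definition 3.2. Then $(\overline{K}_i)_{i \in \mathbb{N}_0}$ 
is a Galton-Watson process whose offspring distribution is mixed binomial with parameters $\overline{M}$ and 
$1/\gamma + N^{-a}$, where $\overline{M}$ is geometric with parameter $e^{-(r+\varrho_N)\sigma_0}$. Similarly, $(\underline{K}_i)_{i \in \mathbb{N}_0}$ is a Galton-Watson 
process whose offspring distribution is mixed binomial with parameters $\underline{M}$ and $1/\gamma - N^{-a}$, where $\underline{M}$ is 
geometric with parameter $e^{-(r+\varrho_N)\sigma_{\lceil \varepsilon N \rceil}}$. 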

3.2. A Galton-Watson approximation 


A crucial role in our analysis of stage 1 of the sweep will be played by equation (35), which we 
are now going to prove. Let $b$ be such that (11) holds, and assume $K_0 = k$ for some $k \le \varepsilon N$. We will 
show that if $a > b$, then with sufficiently large probability $J > N$, and $M_{\sigma_k} < N$. The first part will 
require some work. To start with, we will work with a slight modification of $J$. Let 

$$\hat{J} := \inf\left\{ j : \frac{N - \sum_{l=1}^{j-1} \tilde{X}_l}{\Gamma N - (j-1)} \notin \left[\frac{1}{\Gamma} - \frac{1}{2} N^{-a}, \frac{1}{\Gamma} + \frac{1}{2} N^{-a}\right] \right\}. \tag{37}$$

Lemma 3.4. Let $a \in (b, 1/2)$. There exists a constant $c$ independent of $N$ such that for $N$ large 
enough, 

$$P\left(\hat{J} > N \,\Big|\, |\gamma - \Gamma| \le \frac{1}{2} N^{-a}\right) \ge 1 - 2e^{-cN^{1-2a}}. \tag{38}$$



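Proof. Let $A_\Gamma := \{|\gamma - \Gamma| \le \frac{1}{2} N^{-a}\}$. By the construction and the definition of $\tilde{X}_j$, equation (38) is 
equivalent to 

$$P\left( \frac{N - \sum_{l=1}^{j-1} \tilde{X}_l}{\Gamma N - (j-1)} \in \left[\frac{1}{\Gamma} - \frac{1}{2} N^{-a}, \frac{1}{\Gamma} + \frac{1}{2} N^{-a}\right] \ \forall j \in \{1, \dots, N\} \,\Big|\, A_\Gamma \right) \ge 1 - 2e^{-cN^{1-2a}}. \tag{39}$$

Now rearranging the terms one gets that for $0 \le j \le N - 1$ 

$$\frac{1}{\Gamma} - \frac{1}{2} N^{-a} \le \frac{N - \sum_{l=1}^{j} \tilde{X}_l}{\Gamma N - j} \le \frac{1}{\Gamma} + \frac{1}{2} N^{-a} \tag{40}$$

is equivalent to 

$$-\frac{1}{2} N^{1/2-a}\left(\Gamma - \frac{j}{N}\right) \le \frac{1}{\sqrt{N}} \sum_{l=1}^{j} \left(\tilde{X}_l - \frac{1}{\Gamma}\right) \le \frac{1}{2} N^{1/2-a}\left(\Gamma - \frac{j}{N}\right). \tag{41}$$

So our aim will be to show that with sufficiently large probability on the event $A_\Gamma$ 

$$\sup_{j \in \{0,1,2,\dots,N-1\}} \left\{ \frac{1}{\sqrt{N}} \sum_{l=1}^{j} \left(\tilde{X}_l - \frac{1}{\Gamma}\right) \right\} \le \frac{1}{2} N^{1/2-a} (\Gamma - 1)$$

and 

$$\inf_{j \in \{0,1,2,\dots,N-1\}} \left\{ \frac{1}{\sqrt{N}} \sum_{l=1}^{j} \left(\tilde{X}_l - \frac{1}{\Gamma}\right) \right\} \ge -\frac{1}{2} N^{1/2-a} (\Gamma - 1).$$

Due to our assumptions, we can consider $(\overline{X}_j)_{j=0,\dots,N-1}$ resp. $(\underline{X}_j)_{j=0,\dots,N-1}$ instead of $(\tilde{X}_j)_{j=0,\dots,N-1}$. 
Indeed, since $\gamma, \Gamma \ge 1$ we have on the event $A_\Gamma$ 

$$\left|\frac{1}{\gamma} - \frac{1}{\Gamma}\right| \le |\gamma - \Gamma| \le \frac{1}{2} N^{-a}. \tag{42}$$

Then 

$$\left[\frac{1}{\Gamma} - \frac{1}{2} N^{-a}, \frac{1}{\Gamma} + \frac{1}{2} N^{-a}\right] \subseteq \left[\frac{1}{\gamma} - N^{-a}, \frac{1}{\gamma} + N^{-a}\right],$$

which implies that on the event $A_\Gamma$, (34) is valid for every $i < \hat{J}$. We recall the independence between 
$A_\Gamma$, $\overline{X}$, $\underline{X}$. Thus we are done if we show 

$$P\left( \sup_{j \in \{0,1,2,\dots,N-1\}} \left\{ \frac{1}{\sqrt{N}} \sum_{l=1}^{j} \left(\overline{X}_l - \frac{1}{\Gamma}\right) \right\} \le \frac{1}{2} N^{1/2-a} (\Gamma - 1) \,\Big|\, A_\Gamma \right) \ge 1 - e^{-cN^{1-2a}} \tag{43}$$

and 

$$P\left( \inf_{j \in \{0,1,2,\dots,N-1\}} \left\{ \frac{1}{\sqrt{N}} \sum_{l=1}^{j} \left(\underline{X}_l - \frac{1}{\Gamma}\right) \right\} \ge \frac{1}{2} N^{1/2-a} (-\Gamma + 1) \,\Big|\, A_\Gamma \right) \ge 1 - e^{-cN^{1-2a}}. \tag{44}$$

This is an application of large deviations for maxima of sums of independent random variables, see 
for example [1]. Observing that $E[\overline{X}_j] = \frac{1}{\gamma} + N^{-a}$ and $\mathrm{var}(\overline{X}_j) = \gamma^{-1}(1 - \gamma^{-1} + O(N^{-a}))$, we obtain 
by a direct application of Theorem 1 of [1] that for any $\lambda > 0$ there exists $c_1 = c_1(\lambda, \gamma) \in (0, \infty)$ such 
that 

$$P\left( \sup_{j \in \{0,1,2,\dots,N-1\}} \left\{ \frac{1}{\sqrt{N}} \sum_{l=1}^{j} \left(\overline{X}_l - \frac{1}{\gamma} - N^{-a}\right) \right\} \ge \lambda N^{1/2-a} \right) \le e^{-c_1 N^{1-2a}}. \tag{45}$$

Then (43) follows with (42). Similarly we obtain (44). □ 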


Corollary 3.5. Let $a \in (b, 1/2)$. There exists a constant $c$ independent of $N$ such that for $N$ large 
enough 

$$P(J > N) \ge 1 - e^{-cN^{1-2a}}. \tag{46}$$

Proof. Recall from the proof of the previous lemma that if $|\Gamma - \gamma| \le \frac{1}{2} N^{-a}$ then 

$$\left[\frac{1}{\Gamma} - \frac{1}{2} N^{-a}, \frac{1}{\Gamma} + \frac{1}{2} N^{-a}\right] \subseteq \left[\frac{1}{\gamma} - N^{-a}, \frac{1}{\gamma} + N^{-a}\right],$$

which implies that in this case $\hat{J} \le J$. We already observed that $|\frac{1}{\Gamma} - \frac{1}{\gamma}| \le |\gamma - \Gamma|$, so it remains to 
show that $|\gamma - \Gamma| \le \frac{1}{2} N^{-a}$ with large probability. Indeed, for $k = 1, 2, \dots, N$ and $N$ large enough 

$$P\left(|\Gamma - \gamma| \le \frac{1}{2} N^{-a}\right) = P\left( \left|\frac{Y_{\sigma_k} - \gamma N}{\sqrt{N}}\right| \le \frac{1}{2} N^{1/2-a} \right) \ge 1 - e^{-c' N^{1-2a}}$$

for some constant $c'$ independent of $N$, where the last inequality follows from a generalisation of 
Cramér's theorem, see Theorem 2 of [26] (note that $Y_{\sigma_k}$ is a sum of independent but not identically 
distributed random variables). Let $c$ be a constant independent of $N$ such that $0 < c < \min(c', \tilde{c})$, where 
$\tilde{c}$ is the constant from Lemma 3.4. For $N$ large enough 

$$P(J > N) \ge P(\hat{J} > N \mid A_\Gamma)\, P(A_\Gamma) \ge 1 - 2e^{-\tilde{c} N^{1-2a}} - e^{-c' N^{1-2a}} \ge 1 - e^{-cN^{1-2a}}.$$

□ 

Lemma 3.6. Let $a \in (b, 1/2)$, and $0 < \varepsilon < 1/\gamma$. Assume $\underline{K}_0 \le K_0 \le \overline{K}_0$ and $K_0 = k$, for some 
$k \le \varepsilon N$. There exists $c > 0$ independent of $N$ such that for all $N$ large enough, 

$$P(M_{\sigma_k} < N) \ge 1 - e^{-cN}.$$

Proof. Let $G_j$ be the number of offspring of the mutant number $j \le k \le \varepsilon N$ at the end of the day, 
namely at time $\sigma_k$. By construction they are i.i.d. with finite second moment. Let $(G'_j)_{j \in \mathbb{N}}$ be i.i.d. 
random variables equal in distribution to $G_1$. Note that $E[G_1] \le e^{(r+\varrho_N)\sigma_0} = \gamma(1 + o(1))$. Since $\varepsilon < 1/\gamma$ 
we can choose $N$ large enough such that $E[G_1] < 1/\varepsilon$. Then 

$$P(M_{\sigma_k} < N) = P\Big(\sum_{j=1}^{k} G_j < N\Big) \ge P\Big(\sum_{i=1}^{\varepsilon N} G'_i < N\Big) \ge 1 - e^{-cN}$$

for a suitable $c > 0$. The last inequality follows from Cramér's Theorem, since $\varepsilon E[G_1] < 1$. □ 

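Recall that $T_1^N = \inf\{i \ge 1 : K_i \ge \varepsilon N\}$. 

Proposition 3.7. Let $a \in (b, 1/2)$ and $0 < \varepsilon < 1/\gamma$. Assume $\underline{K}_0 \le K_0 \le \overline{K}_0$ and $K_0 = k \le \varepsilon N$. 
Then there exists $c$ independent of $N$ such that for $N$ large enough 

$$P\left( \overline{K}_{\min(i, T_1^N)} \ge K_{\min(i, T_1^N)} \ge \underline{K}_{\min(i, T_1^N)} \ \forall i \le g \right) \ge \left(1 - 2e^{-cN^{1-2a}}\right)^g \quad \text{for all } g \in \mathbb{N}_0. \tag{47}$$

Proof. Corollary 3.5 implies that $P(\underline{K}_1 \le K_1 \le \overline{K}_1 \mid M_{\sigma_k} < N) \ge 1 - e^{-cN^{1-2a}}$. Thus by Lemma 3.6 
we have 

$$P(\underline{K}_1 \le K_1 \le \overline{K}_1) \ge 1 - 2e^{-cN^{1-2a}}, \tag{48}$$

which implies 

$$P\left( \overline{K}_{\min(g, T_1^N)} \ge K_{\min(g, T_1^N)} \ge \underline{K}_{\min(g, T_1^N)} \,\Big|\, \overline{K}_{\min(i, T_1^N)} \ge K_{\min(i, T_1^N)} \ge \underline{K}_{\min(i, T_1^N)} \ \forall i \le g - 1 \right) \ge 1 - 2e^{-cN^{1-2a}}. \tag{49}$$

From (49) the result follows easily by induction: Assume that (47) is true for $g - 1$. Then 

$$\begin{aligned}
& P\left( \overline{K}_{\min(i, T_1^N)} \ge K_{\min(i, T_1^N)} \ge \underline{K}_{\min(i, T_1^N)} \ \forall i \le g \right) \\
&\quad = P\left( \overline{K}_{\min(g, T_1^N)} \ge K_{\min(g, T_1^N)} \ge \underline{K}_{\min(g, T_1^N)} \,\Big|\, \overline{K}_{\min(i, T_1^N)} \ge K_{\min(i, T_1^N)} \ge \underline{K}_{\min(i, T_1^N)} \ \forall i \le g - 1 \right) \\
&\qquad \times P\left( \overline{K}_{\min(i, T_1^N)} \ge K_{\min(i, T_1^N)} \ge \underline{K}_{\min(i, T_1^N)} \ \forall i \le g - 1 \right) \\
&\quad \ge \left(1 - 2e^{-cN^{1-2a}}\right)\left(1 - 2e^{-cN^{1-2a}}\right)^{g-1}.
\end{aligned}$$

□ 

3.3. Asymptotics of the stopping rule 

In order to put the Galton-Watson bounds to use, we need some control on $\sigma_k$. 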


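Lemma 3.8. Under the assumptions of this section, for any $k = 1, 2, \dots, N$, 

$$\sigma_k = \frac{\log\gamma}{r + k\varrho_N/N} + \frac{k}{N} O(\varrho_N^2) + \frac{k^2}{N^2} O(\varrho_N^2), \tag{50}$$

where $|O(\varrho_N^2)|/\varrho_N^2$ is bounded uniformly in $N$ and $k$. 


Proof. Note that $\frac{\log\gamma}{r + \varrho_N} = \sigma_N \le \sigma_k \le \sigma_0 = \frac{\log\gamma}{r}$ for all $k = 0, \dots, N$. Hence $\lim_{N \to \infty} \sigma_k = \frac{\log\gamma}{r}$ for all 
$k$. We assume that $N$ is large enough such that $\frac{\log\gamma}{2r} \le \sigma_k \le \frac{\log\gamma}{r}$. 
By (12) and (13) we have 

$$\gamma N = E[M_{\sigma_k}^{(k)}] + E[Z_{\sigma_k}^{(N-k)}] = k e^{(r+\varrho_N)\sigma_k} + (N - k) e^{r\sigma_k}. \tag{51}$$

Hence $\sigma_k$ satisfies the equation 

$$\gamma N = e^{r\sigma_k}\left(k e^{\varrho_N \sigma_k} + N - k\right). \tag{52}$$

Dividing by $N$, taking logarithms on both sides, and using Taylor expansion first on the exponential 
and then on the logarithm leads to 

$$\begin{aligned}
\log\gamma &= r\sigma_k + \log\left(1 + \frac{k}{N}\varrho_N \sigma_k + \frac{k}{N} O(\varrho_N^2)\right) \\
&= r\sigma_k + \frac{k}{N}\varrho_N \sigma_k + \frac{k}{N} O(\varrho_N^2) + \frac{k^2}{N^2} O(\varrho_N^2).
\end{aligned} \tag{53}$$

Here we use the fact that $\frac{\log\gamma}{2r} \le \sigma_k \le \frac{\log\gamma}{r}$ for all $k$ if $N$ is sufficiently large. Rewriting, we get the 
desired expression of $\sigma_k$. □ 

We will use this mostly in the following form, which is an immediate application of Lemma 3.8. 

Corollary 3.9. For any $k = 1, 2, \dots, N$, as $N \to \infty$ 

$$e^{(r+\varrho_N)\sigma_k} = \gamma\left(1 + \Big(1 - \frac{k}{N}\Big)\frac{\log\gamma}{r}\varrho_N + O(\varrho_N^2)\right),$$

where $|O(\varrho_N^2)|/\varrho_N^2$ is bounded uniformly in $N$ and $k$. 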
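
The quality of the approximation in Lemma 3.8 can be checked numerically. The Python sketch below solves the defining equation (52) by bisection and compares the result with $\log\gamma/(r + k\varrho_N/N)$ over all $k$; the function names and the parameter values in the usage note are illustrative assumptions.

```python
import math

def sigma_exact(k, N, r, rho, gamma, iterations=200):
    """Root of k*e^{(r+rho)t} + (N-k)*e^{rt} = gamma*N, found by bisection."""
    lo, hi = 0.0, math.log(gamma) / r
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        size = k * math.exp((r + rho) * mid) + (N - k) * math.exp(r * mid)
        if size < gamma * N:
            lo = mid
        else:
            hi = mid
    return hi

def sigma_approx(k, N, r, rho, gamma):
    """Leading term of Lemma 3.8."""
    return math.log(gamma) / (r + k * rho / N)

def max_error(N, r, rho, gamma):
    """Largest deviation between exact and approximate sigma_k over k = 0..N."""
    return max(
        abs(sigma_exact(k, N, r, rho, gamma) - sigma_approx(k, N, r, rho, gamma))
        for k in range(N + 1)
    )
```

For instance, comparing `max_error(200, 1.0, 0.01, 100.0)` with `max_error(200, 1.0, 0.005, 100.0)` shows the error shrinking by roughly a factor of four when $\varrho_N$ is halved, as expected for a remainder of order $\varrho_N^2$.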


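3.4. Asymptotics of the approximating Galton-Watson processes and Proof of Prop. 2.8 

We can now calculate the asymptotic expectation and variance of our auxiliary Galton-Watson 
processes. 


Lemma 3.10. Let $a \in (b, 1/2)$. Let $(\overline{K}_i)_{i \in \mathbb{N}_0}$ and $(\underline{K}_i)_{i \in \mathbb{N}_0}$ be as defined in Section 3.1 with 
$\underline{K}_0 = K_0 = \overline{K}_0 = 1$. We have 

$$E_1[\overline{K}_1] = 1 + \frac{\log\gamma}{r}\varrho_N + o(\varrho_N), \qquad E_1[\underline{K}_1] = 1 + \frac{\log\gamma}{r}(1 - \varepsilon)\varrho_N + o(\varrho_N), \tag{54}$$

and 

$$\mathrm{var}_1[\overline{K}_1] = \frac{2(\gamma - 1)}{\gamma}(1 + O(\varrho_N)), \qquad \mathrm{var}_1[\underline{K}_1] = \frac{2(\gamma - 1)}{\gamma}(1 + O(\varrho_N)). \tag{55}$$

Proof. Recall $\underline{M}$, $\overline{M}$ from Proposition 3.3. By construction, and from Corollary 3.9, 

$$\begin{aligned}
E_1[\underline{K}_1] &= (1/\gamma - N^{-a})\, E[\underline{M}] \\
&= (1/\gamma - N^{-a})\, e^{(r+\varrho_N)\sigma_{\lceil \varepsilon N \rceil}} \\
&= 1 + \frac{\log\gamma}{r}(1 - \varepsilon)\varrho_N - \gamma N^{-a} + o(\varrho_N) \\
&= 1 + \frac{\log\gamma}{r}(1 - \varepsilon)\varrho_N + o(\varrho_N),
\end{aligned}$$

where the last equality follows from the fact that our assumptions imply that $N^{-a} = o(\varrho_N)$. In the 
same way we obtain 

$$E_1[\overline{K}_1] = 1 + \frac{\log\gamma}{r}\varrho_N + o(\varrho_N).$$

It remains to calculate the variance 

$$\begin{aligned}
\mathrm{var}_1[\underline{K}_1] &= E_1\big[\mathrm{var}_1[\underline{K}_1 \mid \underline{M}]\big] + \mathrm{var}_1\big[E_1[\underline{K}_1 \mid \underline{M}]\big] \\
&= E_1\Big[\underline{M}\Big(\frac{1}{\gamma} - N^{-a}\Big)\Big(1 - \frac{1}{\gamma} + N^{-a}\Big)\Big] + \mathrm{var}_1\Big[\underline{M}\Big(\frac{1}{\gamma} - N^{-a}\Big)\Big] \\
&= \Big(\frac{1}{\gamma} - N^{-a}\Big)\Big(1 - \frac{1}{\gamma} + N^{-a}\Big) e^{(r+\varrho_N)\sigma_{\lceil \varepsilon N \rceil}} + \Big(\frac{1}{\gamma} - N^{-a}\Big)^2 \Big(e^{2(r+\varrho_N)\sigma_{\lceil \varepsilon N \rceil}} - e^{(r+\varrho_N)\sigma_{\lceil \varepsilon N \rceil}}\Big).
\end{aligned}$$

Plugging in Corollary 3.9, simplifying and taking into account that $N^{-a} = o(\varrho_N)$ for $a > b$ leads to 

$$\mathrm{var}_1[\underline{K}_1] = \frac{2(\gamma - 1)}{\gamma}(1 + O(\varrho_N)) = \frac{2(\gamma - 1)}{\gamma} + O(\varrho_N).$$

The same steps lead to $\mathrm{var}_1[\overline{K}_1] = 2(\gamma - 1)/\gamma + O(\varrho_N)$. 

□ 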


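Remark 3.11. (i) This result together with Lemma 3.6 proves Proposition 2.8. (ii) Applying Lemma 
B.1 from the Appendix shows 

$$P\big((\overline{K}_i) \text{ survives}\big) \sim \frac{C(\gamma)}{r}\varrho_N$$

and 

$$P\big((\underline{K}_i) \text{ survives}\big) \sim \frac{C(\gamma)}{r}(1 - \varepsilon)\varrho_N.$$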
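
To illustrate the survival asymptotics, the following Python sketch computes the survival probability of a Galton-Watson process with mixed binomial offspring (binomial with success probability $q$ and a geometric number of trials with parameter $p$) by iterating the offspring generating function from 0. As a simplifying assumption, the $N^{-a}$ correction of the upper process is dropped, so $q = 1/\gamma$ and $p = e^{-(r+\varrho_N)\sigma_0}$; all function names are hypothetical.

```python
import math

def offspring_pgf(s, p, q):
    """E[s^K] for K ~ Binomial(M, q), with M geometric on {1, 2, ...}
    and success probability p, i.e. E[u^M] = p*u / (1 - (1-p)*u)."""
    u = 1.0 - q + q * s
    return p * u / (1.0 - (1.0 - p) * u)

def survival_probability(p, q, tol=1e-15, max_iter=1_000_000):
    """One minus the smallest fixed point of the pgf, by fixed-point iteration."""
    s = 0.0
    for _ in range(max_iter):
        s_next = offspring_pgf(s, p, q)
        if abs(s_next - s) < tol:
            s = s_next
            break
        s = s_next
    return 1.0 - s

def upper_process_survival(r, rho, gamma):
    """Survival probability of the upper bounding process, with the
    N^{-a} term neglected (an assumption made for this illustration)."""
    sigma0 = math.log(gamma) / r
    p = math.exp(-(r + rho) * sigma0)
    q = 1.0 / gamma
    return survival_probability(p, q)
```

With $r = 1$, $\varrho_N = 0.01$ and $\gamma = 100$, the value returned by `upper_process_survival` lies within a few percent of $C(\gamma)\varrho_N / r$, and the agreement improves as $\varrho_N$ decreases.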


Corollary 3.12. Under the assumptions of Lemma \3.1C\ for k < eN , as N -7 00, 

P k{(Ki) survives \ (Kf) survives ) = Pfc((iQ) dies out | (Ki) dies out) = 1. (56) 

Further, 


and 


P k((K.i) dies out \ (Ki) survives ) < e(l + o(l)), 
P k((Ki) survives \ (K_f) dies out ) < e(l + o(l)). 


(57) 


(58) 


Proof. The first equation follows immediately from ( |36| ). We prove (57), (58) follows similarly. Let 
c(l,r) := • Note that 


E k((K.i ) dies out | (Ki) survives ) = 


P k((ff_i) dies out) — P k((Kf) dies out) 
Pfc((A'i) survives) 

(1 - c(7 ,r)(l - e)g N ) k - (1 ~ c(^,r)g N ) k 
1 - (1 - c('y,r)g N ) k 


( 59 ) 
Let $g(k)$ be the r.h.s. of (59). We will show below that $g$ is decreasing in $k$ if $N$ is large, from which the statement follows, observing

$$g(k) \le g(1) \le \varepsilon(1 + o(1)).$$

To prove the monotonicity of $g(k)$, let $a = c(\gamma, r)\varrho_N$. Let $N$ be large enough such that $0 < a < 1$. Assume that $k \ge 1$ and $k \in \mathbb{R}_+$. Then we can differentiate $\log(1 - g(k))$ in $k$, which yields

$$\frac{d\log(1 - g(k))}{dk} = \frac{(1-a)^k\log(1-a)}{1 - (1-a)^k} - \frac{(1 - a + a\varepsilon)^k\log(1 - a + a\varepsilon)}{1 - (1 - a + a\varepsilon)^k}. \tag{60}$$

The function $x \mapsto \frac{x^k\log x}{1 - x^k}$ is a decreasing function in $x$, for $0 < x < 1$, as can be seen by differentiation. Applying this to the r.h.s. of (60), we obtain $\frac{d\log(1-g(k))}{dk} \ge 0$ for all $k \ge 1$. This implies $\frac{dg(k)}{dk} \le 0$. So $g(k)$ is decreasing in $k$.
□
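The monotonicity argument can be sanity-checked numerically. The following is a minimal sketch, assuming the right-hand side of (59) has the form $g(k) = \big[(1-a+a\varepsilon)^k - (1-a)^k\big] / \big[1 - (1-a)^k\big]$ with $a = c(\gamma,r)\varrho_N$; the function names and the parameter values used in the usage note are illustrative assumptions, not part of the paper.

```python
def g(k: float, a: float, eps: float) -> float:
    """Assumed right-hand side of (59), with a = c(gamma, r) * rho_N."""
    numerator = (1.0 - a + a * eps) ** k - (1.0 - a) ** k
    denominator = 1.0 - (1.0 - a) ** k
    return numerator / denominator


def is_decreasing(a: float, eps: float, k_max: int = 200) -> bool:
    """Check g(1) >= g(2) >= ... >= g(k_max) on the integers."""
    values = [g(k, a, eps) for k in range(1, k_max + 1)]
    return all(x >= y for x, y in zip(values, values[1:]))
```

Under this form one has $g(1) = a\varepsilon / a = \varepsilon$ exactly, so the bound $g(k) \le g(1)$ gives the constant $\varepsilon$ appearing in (57); for instance `g(1, 0.01, 0.05)` should agree with `0.05` up to rounding.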


3.5. First stage of the sweep 

With these preparations we can now address the first stage of the sweep, cf. Figure 4. We are going to calculate the probability that the number of mutants reaches $\varepsilon N$ for some $\varepsilon > 0$, and determine the time it takes to reach $\varepsilon N$. We achieve this by using the supercritical Galton-Watson processes provided by Lemma 3.7. Recall $T_1^N = \inf\{i \ge 0 : K_i \ge \varepsilon N\}$.

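Lemma 3.13. Let $0 < \varepsilon < 1/\gamma$. Then we have as $N \to \infty$

$$\frac{\varrho_N\log\gamma}{r}\,\frac{\gamma}{\gamma-1}\,(1-\varepsilon)(1+o(1)) \le \mathbb{P}_1(\exists i : K_i \ge \varepsilon N) \le \frac{\varrho_N\log\gamma}{r}\,\frac{\gamma}{\gamma-1}\,(1+o(1)), \tag{61}$$

and for any $\delta > 0$

$$\limsup_{N\to\infty} \mathbb{P}_1\big(T_1^N > \varrho_N^{-1-\delta} \mid T_1^N < \infty\big) \le \frac{\varepsilon}{1-\varepsilon}.$$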

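Proof. Let $a \in (b, 1/2)$ and let $(\underline{K}_i)_{i\in\mathbb{N}_0}$ and $(\overline{K}_i)_{i\in\mathbb{N}_0}$ be defined as in Section 3.1 with $\underline{K}_0 = \overline{K}_0 = K_0 = 1$. We write $\{(K_i) \text{ reaches } \varepsilon N\}$ for the event that there exists $i \ge 0$ such that $K_i \ge \varepsilon N$, and analogously for $(\underline{K}_i)$, $(\overline{K}_i)$. By Remark 3.11, Lemma Appendix B.1 and Lemma Appendix B.2,

$$\mathbb{P}_1\big((\overline{K}_i) \text{ reaches } \varepsilon N\big) \sim \mathbb{P}_1\big((\overline{K}_i) \text{ survives}\big) \sim \frac{\varrho_N\log\gamma}{r}\,\frac{\gamma}{\gamma-1} \tag{62}$$

and

$$\mathbb{P}_1\big((\underline{K}_i) \text{ reaches } \varepsilon N\big) \sim \mathbb{P}_1\big((\underline{K}_i) \text{ survives}\big) \sim \frac{\varrho_N\log\gamma}{r}\,\frac{\gamma}{\gamma-1}\,(1-\varepsilon). \tag{63}$$

Let

$$A := A(\gamma, a, \varepsilon, \delta, N) := \big\{\underline{K}_i \le K_i \le \overline{K}_i \ \ \forall\, i \le \min(T_1^N, \varrho_N^{-1-\delta})\big\}.$$

Setting $g := \varrho_N^{-1-\delta}$ in Proposition 3.7 and applying the Bernoulli inequality we have

$$\mathbb{P}_1(A^c) \le 1 - \big(1 - 2e^{-cN^{1-2a}}\big)^{\varrho_N^{-1-\delta}} \le \varrho_N^{-1-\delta}\, 2e^{-cN^{1-2a}}, \tag{64}$$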


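implying $\mathbb{P}_1(A) \to 1$ exponentially fast as $N \to \infty$. Let $\underline{T}_1^N := \inf\{i \ge 0 : \underline{K}_i \ge \varepsilon N\}$. Then

$$
\begin{gathered}
\mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N\big) \ge \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\underline{K}_i) \text{ reaches } \varepsilon N,\ A,\ \underline{T}_1^N \le \varrho_N^{-1-\delta}\big) \\
= \mathbb{P}_1\big((\underline{K}_i) \text{ reaches } \varepsilon N,\ A,\ \underline{T}_1^N \le \varrho_N^{-1-\delta}\big) \\
\ge \mathbb{P}_1\big((\underline{K}_i) \text{ reaches } \varepsilon N,\ \underline{T}_1^N \le \varrho_N^{-1-\delta}\big) - \mathbb{P}(A^c) \\
\sim \mathbb{P}_1\big((\underline{K}_i) \text{ reaches } \varepsilon N\big)
\end{gathered}
\tag{65}
$$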


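using (64) and Lemma B.3 in the last inequality. Together with (63) this proves the lower bound in (61). For the upper bound, let $\overline{T}_0 := \inf\{i : \overline{K}_i = 0\}$. Note that

$$\mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N\big) = \mathbb{P}\big((K_{i\wedge T_1^N}) \text{ reaches } \varepsilon N\big)$$

and

$$\mathbb{P}_1\big((K_{i\wedge T_1^N}) \text{ reaches } \varepsilon N\big) = 1 - \mathbb{P}\big((K_{i\wedge T_1^N}) \text{ dies out}\big).$$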

Thus we have 


1 — Pi((A'j) reaches eN) > 

> 


Pi((A',:at») dies out) 

Pi ({K iAT N) dies out; {Ki) dies out; A;Tq < g ^~ 5 ) 

Pi {{Ki) dies out; A; Tq < gj/~ S ) 

P i{{Ki) dies out) 

1 — Pi((A'j) reaches eN), 


( 66 ) 


where we have used (B.1) from the Appendix and Lemma Appendix B.2. This implies the upper bound.

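We are thus left with proving the last statement of the Lemma. Fix $\delta > 0$. We have

$$
\begin{gathered}
\mathbb{P}_1\big(T_1^N > \varrho_N^{-1-\delta} \mid (K_i) \text{ reaches } \varepsilon N\big) = \frac{\mathbb{P}_1\big(T_1^N > \varrho_N^{-1-\delta},\ (K_i) \text{ reaches } \varepsilon N,\ (\underline{K}_i) \text{ survives}\big)}{\mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N\big)} \\
+ \frac{\mathbb{P}_1\big(T_1^N > \varrho_N^{-1-\delta},\ (K_i) \text{ reaches } \varepsilon N,\ (\underline{K}_i) \text{ dies out}\big)}{\mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N\big)}.
\end{gathered}
\tag{67}
$$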


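By (65) and Lemma Appendix B.2 we have for large enough $N$ the inequality

$$\mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N\big) \ge \mathbb{P}_1\big((\underline{K}_i) \text{ survives}\big),$$

and thus the first term on the right-hand side of (67) can be bounded from above by

$$
\begin{gathered}
\mathbb{P}_1\big(T_1^N > \varrho_N^{-1-\delta} \mid (\underline{K}_i) \text{ survives}\big) \le \mathbb{P}_1\big(\underline{T}_1^N > \varrho_N^{-1-\delta},\ A \mid (\underline{K}_i) \text{ survives}\big) + \mathbb{P}_1\big(A^c \mid (\underline{K}_i) \text{ survives}\big) \\
\le \mathbb{P}_1\big(\underline{T}_1^N > \varrho_N^{-1-\delta} \mid (\underline{K}_i) \text{ survives}\big) + \frac{\mathbb{P}_1(A^c)}{\mathbb{P}\big((\underline{K}_i) \text{ survives}\big)}.
\end{gathered}
\tag{68}
$$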

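The first term on the right-hand side converges to 0 due to Lemma Appendix B.3. By Lemma Appendix B.1 we have $\mathbb{P}_1\big((\underline{K}_i) \text{ survives}\big) \sim c\varrho_N$, therefore by (64) the second term on the right-hand side converges to 0 as well. Thus we have shown that the first summand in (67) converges to 0. To deal with the second term, we observe

$$
\begin{gathered}
\mathbb{P}_1\big(T_1^N > \varrho_N^{-1-\delta},\ (K_i) \text{ reaches } \varepsilon N,\ (\underline{K}_i) \text{ dies out}\big) \\
\le \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\underline{K}_i) \text{ dies out}\big) \\
= \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\underline{K}_i) \text{ dies out},\ (\overline{K}_i) \text{ dies out}\big) \\
+ \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\underline{K}_i) \text{ dies out},\ (\overline{K}_i) \text{ survives}\big) \\
\le \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\overline{K}_i) \text{ dies out}\big) + \mathbb{P}_1\big((\underline{K}_i) \text{ dies out},\ (\overline{K}_i) \text{ survives}\big) \\
\le \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\overline{K}_i) \text{ dies out},\ \overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0\big) \\
+ \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\overline{K}_i) \text{ dies out},\ \overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} = 0\big) \\
+ \mathbb{P}_1\big((\underline{K}_i) \text{ dies out},\ (\overline{K}_i) \text{ survives}\big).
\end{gathered}
\tag{69}
$$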
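We have

$$\mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\overline{K}_i) \text{ dies out},\ \overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0\big) \le \mathbb{P}_1\big((\overline{K}_i) \text{ dies out},\ \overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0\big),$$

which goes to 0 exponentially fast due to (B.1) in the Appendix, and using Lemma Appendix B.2 we get

$$\mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N,\ (\overline{K}_i) \text{ dies out},\ \overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} = 0\big) \le \mathbb{P}_1(A^c),$$

which goes to 0 exponentially fast due to (64). Finally we have

$$
\begin{gathered}
\mathbb{P}_1\big((\underline{K}_i) \text{ dies out},\ (\overline{K}_i) \text{ survives}\big) = \mathbb{P}_1\big((\underline{K}_i) \text{ dies out} \mid (\overline{K}_i) \text{ survives}\big)\,\mathbb{P}_1\big((\overline{K}_i) \text{ survives}\big) \\
\le \varepsilon(1 + o(1))\,\mathbb{P}_1\big((\overline{K}_i) \text{ survives}\big) \\
\le \frac{\varepsilon}{1-\varepsilon}(1 + o(1))\,\mathbb{P}_1\big((\underline{K}_i) \text{ survives}\big),
\end{gathered}
$$

see Corollary 3.12. Thus the second summand in (67) is bounded from above by $\frac{\varepsilon}{1-\varepsilon}(1 + o(1))$, and the claim follows.

□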

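Corollary 3.14. Let $T_0^N := \inf\{i : K_i = 0\}$. For $0 < \varepsilon < 1/\gamma \wedge 1/16$ there exists $N_\varepsilon^{(1)}$ such that for all $N \ge N_\varepsilon^{(1)}$ and any $k < \varepsilon N$,

$$\mathbb{P}_k\big(T_1^N \wedge T_0^N > \varrho_N^{-1-\delta}\big) \le 1/2.$$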

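Proof. Fix $k < \varepsilon N$. We have

$$
\begin{gathered}
\mathbb{P}_k\big(T_1^N \wedge T_0^N > \varrho_N^{-1-\delta}\big) = \mathbb{P}_k\big(T_1^N > \varrho_N^{-1-\delta} \mid T_1^N \wedge T_0^N = T_1^N\big)\,\mathbb{P}_k\big(T_1^N \wedge T_0^N = T_1^N\big) \\
+ \mathbb{P}_k\big(T_0^N > \varrho_N^{-1-\delta} \mid T_1^N \wedge T_0^N = T_0^N\big)\,\mathbb{P}_k\big(T_1^N \wedge T_0^N = T_0^N\big).
\end{gathered}
\tag{70}
$$

Due to (57) we can see that all the steps leading to the last statement in Lemma 3.13 hold if the processes are started in $k < \varepsilon N$ instead of 1. Hence we have that for all $1 \le k < \varepsilon N$

$$\limsup_{N\to\infty} \mathbb{P}_k\big(T_1^N > \varrho_N^{-1-\delta} \mid T_1^N < \infty\big) \le \frac{\varepsilon}{1-\varepsilon}. \tag{71}$$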


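Moreover, if we stop $(K_i)$ with $K_0 = k < \varepsilon N$ when the Markov chain is larger than $\varepsilon N$, then $(K_i)$ is an absorbing Markov chain with absorbing states 0 and any number larger than $\varepsilon N$. That implies $\mathbb{P}_k(T_1^N \wedge T_0^N < \infty) = 1$. Notice that under the event $\{T_1^N \wedge T_0^N < \infty\}$, we have $\{T_1^N < \infty\} = \{T_1^N \wedge T_0^N = T_1^N\}$. Altogether we obtain

$$\limsup_{N\to\infty} \mathbb{P}_k\big(T_1^N > \varrho_N^{-1-\delta} \mid T_1^N \wedge T_0^N = T_1^N\big) \le \frac{\varepsilon}{1-\varepsilon}(1 + o(1)), \tag{72}$$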


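which is smaller than 1/4 for our choice of $\varepsilon$. Therefore the claim follows from (70) for any $k < \varepsilon N$ such that $\mathbb{P}_k(T_0^N > \varrho_N^{-1-\delta} \mid T_1^N \wedge T_0^N = T_0^N) \le 1/4$. Assume therefore that $\mathbb{P}_k(T_0^N > \varrho_N^{-1-\delta} \mid T_1^N \wedge T_0^N = T_0^N) > 1/4$. Due to Proposition 3.7 and Lemma Appendix B.3 we then have that $\mathbb{P}_k(T_1^N \wedge T_0^N = T_0^N) > 1/4$ for $N$ large enough. For such $k$

$$
\begin{gathered}
\mathbb{P}_k\big(T_0^N > \varrho_N^{-1-\delta} \mid T_1^N \wedge T_0^N = T_0^N\big) \le \mathbb{P}_k\big(K_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0,\ A \mid T_1^N \wedge T_0^N = T_0^N\big) + \mathbb{P}_k\big(A^c \mid T_1^N \wedge T_0^N = T_0^N\big) \\
\le 4\,\mathbb{P}_k\big(\overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0,\ (\underline{K}_i)_{i\in\mathbb{N}} \text{ dies out}\big) + 4\,\mathbb{P}_k(A^c).
\end{gathered}
$$

Equation (58) implies

$$4\,\mathbb{P}_k\big(\overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0,\ (\underline{K}_i)_{i\in\mathbb{N}} \text{ dies out}\big) \le 4\,\mathbb{P}_k\big(\overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0,\ (\overline{K}_i)_{i\in\mathbb{N}} \text{ dies out}\big) + 4\varepsilon(1 + o(1)).$$

By (64), $\mathbb{P}_k(A^c)$ goes to 0 exponentially fast, and $\mathbb{P}_k\big(\overline{K}_{\lfloor\varrho_N^{-1-\delta}\rfloor} > 0 \mid (\overline{K}_i)_{i\in\mathbb{N}_0} \text{ dies out}\big)$ goes to 0 by (B.1). Thus if $\varepsilon < 1/16$ the right-hand side of the above inequality is bounded above by 1/4, and we have completed the proof. □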
3.6. Second stage of the sweep

Lemma 3.15. For $\varepsilon \in (0, 1/2)$ let $1 - \varepsilon' \in (\varepsilon, 1)$. Then we have for any $k \ge \varepsilon N$

$$\lim_{N\to\infty} \mathbb{P}_k\big(\exists i : K_i \ge \lfloor(1 - \varepsilon')N\rfloor\big) = 1.$$

Moreover, $\lim_{N\to\infty} \mathbb{P}(T_2^N - T_1^N > \varrho_N^{-1-\delta}) = 0$ for any $\delta > 0$, where we recall $T_2^N = \inf\{i : K_i \ge (1 - \varepsilon)N\}$.

Proof. We use an ODE approximation. Recall that $K_i$ denotes the number of mutants at the beginning of day $i$. Let $x \in [\varepsilon, 1)$. From Corollary 3.9 we obtain that the expected number of offspring at the end of day $i$ of a single mutant, given that there are $\lfloor xN\rfloor$ mutants at the beginning of the day, is given by $e^{(r+\varrho_N)\sigma_{\lfloor xN\rfloor}}$. Using Corollary 3.9 we obtain

$$\mathbb{E}\big[K_i \mid K_{i-1} = \lfloor xN\rfloor\big] = \lfloor xN\rfloor\,\frac{1}{\gamma}\,e^{(r+\varrho_N)\sigma_{\lfloor xN\rfloor}} = \lfloor xN\rfloor\Big(1 + \frac{\varrho_N\log\gamma}{r}(1 - x) + O(\varrho_N^2)\Big). \tag{73}$$


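From Lemma Appendix A.3 b) and Corollary 3.9 we see that there exists $c = c(\gamma, r) < \infty$ such that

$$\mathrm{var}(K_i \mid K_{i-1} = k) \le cN, \quad k = 1, 2, \ldots, N.$$

For $f \in C^2[0,1]$ we define the rescaled discrete generator of $(K_i)_{i\in\mathbb{N}_0}$

$$A^N f\big(\tfrac{k}{N}\big) = \varrho_N^{-1}\,\mathbb{E}\big[f(K_i/N) - f(k/N) \mid K_{i-1} = k\big], \quad \tfrac{k}{N} \in [0,1].$$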

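Using Taylor approximation on $f$ we infer that, for some $y \in [0,1]$,

$$A^N f\big(\tfrac{k}{N}\big) = \varrho_N^{-1}\Big(\mathbb{E}_k\Big[\frac{K_1}{N} - \frac{k}{N}\Big] f'\big(\tfrac{k}{N}\big) + \frac{1}{2}\,\mathbb{E}_k\Big[\Big(\frac{K_1}{N} - \frac{k}{N}\Big)^2\Big] f''(y)\Big).$$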

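We have, writing $x = k/N$,

$$
\begin{gathered}
\mathbb{E}_k\Big[\Big(\frac{K_1}{N} - \frac{k}{N}\Big)^2\Big] = \frac{1}{N^2}\Big(\mathbb{E}_k\big[K_1^2 - \mathbb{E}_k[K_1]^2\big] + \mathbb{E}_k[K_1]^2 - 2k\,\mathbb{E}_k[K_1] + k^2\Big) \\
= \frac{1}{N^2}\,\mathrm{var}_k(K_1) + \big(\mathbb{E}_k[K_1]/N - x\big)^2 \\
\le \frac{c}{N} + O(\varrho_N^2),
\end{gathered}
\tag{74}
$$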


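where $|O(\varrho_N^2)|/\varrho_N^2$ is bounded uniformly in $N$ and $k$. Hence, recalling that $\varrho_N^{-1}N^{-a} \to 0$ for $a > b$ and the continuity of $f''$ on $[0,1]$, we obtain the following convergence, which is uniform in $k$ and $y$:

$$\sup_{y\in[0,1],\ k=0,1,\ldots,N} \Big|\varrho_N^{-1}\,\mathbb{E}_k\Big[\Big(\frac{K_1}{N} - \frac{k}{N}\Big)^2\Big] f''(y)\Big| \to 0 \quad \text{as } N \to \infty.$$

Since $\mathbb{E}_k[K_1 - k] = \frac{k}{\gamma}\,e^{(r+\varrho_N)\sigma_k} - k$, one can apply Corollary 3.9. Together with the above display, we obtain

$$\sup_{k=0,1,\ldots,N} \Big|A^N f\big(\tfrac{k}{N}\big) - \frac{\log\gamma}{r}\,\frac{k}{N}\Big(1 - \frac{k}{N}\Big) f'\big(\tfrac{k}{N}\big)\Big| \to 0 \quad \text{as } N \to \infty.$$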


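Applying Theorem 1.6.5 and Theorem 4.2.6 of [10] we infer that for every $x \in [0,1]$, the sequence of processes $\big(\tfrac{1}{N}K_{\lfloor\varrho_N^{-1}t\rfloor}\big)_{t\ge 0}$, $N = 1, 2, \ldots$, $K_0 = \lfloor xN\rfloor$, converges locally uniformly in distribution to the deterministic (increasing) function $g(t)$ which is defined by the initial value problem

$$g'(t) = \frac{\log\gamma}{r}\,g(t)\big(1 - g(t)\big), \qquad g(0) = x \in [0,1].$$

Now choose $t^*$ such that $g(t^*) > 1 - \varepsilon'$, provided $g(0) = \varepsilon > 0$. This implies

$$\lim_{N\to\infty} \mathbb{P}\big(K_{\lfloor\varrho_N^{-1}t^*\rfloor} \ge \lfloor(1 - \varepsilon')N\rfloor \mid K_0 \ge \lfloor\varepsilon N\rfloor\big) = 1, \tag{75}$$

and a fortiori, $\lim_{N\to\infty} \mathbb{P}(T_2^N - T_1^N > \varrho_N^{-1-\delta}) = 0$ for any positive $\delta$.

□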
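The deterministic limit used in this proof can be made explicit. The sketch below assumes that the limiting initial value problem is the logistic equation $g'(t) = \frac{\log\gamma}{r}\,g(t)(1 - g(t))$, as suggested by (73); the function names and the parameter values in the usage note are illustrative assumptions.

```python
import math


def logistic_limit(t: float, x: float, gamma: float, r: float) -> float:
    """Closed-form solution of g' = (log(gamma)/r) g (1 - g) with g(0) = x."""
    c = math.log(gamma) / r
    e = math.exp(c * t)
    return x * e / (1.0 - x + x * e)


def time_to_reach(eps: float, eps_prime: float, gamma: float, r: float) -> float:
    """Time at which the solution started at eps equals 1 - eps_prime."""
    c = math.log(gamma) / r
    return math.log((1.0 - eps_prime) * (1.0 - eps) / (eps_prime * eps)) / c
```

Any $t^*$ strictly larger than `time_to_reach(eps, eps_prime, gamma, r)` satisfies $g(t^*) > 1 - \varepsilon'$. Since this time does not depend on $N$, the second stage lasts of the order of $\varrho_N^{-1}$ days, which is much shorter than $\varrho_N^{-1-\delta}$.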
Corollary 3.16. For any $\varepsilon \in (0, 1/2)$, there exists $N_\varepsilon^{(2)} \in \mathbb{N}$ such that for every $N \ge N_\varepsilon^{(2)}$, for every $k \ge \varepsilon N$,

$$\mathbb{P}_k\big(\exists i \le \varrho_N^{-1-\delta} : K_i \ge \lfloor(1 - \varepsilon)N\rfloor\big) \ge 1/2.$$

Proof. The proof follows immediately from (75).

□

3.7. Third stage of the sweep

For the last stage of the sweep, after the number of mutants has reached at least $(1 - \varepsilon)N$, we use a Galton-Watson coupling similar in spirit to the coupling at the first stage. The difference is that this time we will be working with the process of wild-type individuals rather than the mutants. Fix again $a \in (b, 1/2)$. Let $Q_i := N - K_i$ be the number of wild-type individuals at the beginning of day $i$.

We proceed similarly as in Section 3.1 to define approximating Galton-Watson processes $(\underline{Q}_i)_{i\in\mathbb{N}_0}$ and $(\overline{Q}_i)_{i\in\mathbb{N}_0}$, for $i \in \mathbb{N}$ constructing $\underline{Q}_i$ and $\overline{Q}_i$ recursively from the same Yule forest as $Q_i$: Recall that the wild-type individuals reproduce at rate $r$. Assume that $\underline{Q}_{i-1}$ and $\overline{Q}_{i-1}$ are constructed, and start independent Yule trees growing at rate $r$ for each individual as we did in Section 3.1 to construct $\overline{K}_i$ and $\underline{K}_i$. Assume $Q_{i-1} = q \in (0, \varepsilon N)$. Grow the Yule trees until time $\sigma_{\lceil(1-2\varepsilon)N\rceil}$ and distinguish the individuals according to whether they were born before $\sigma_N$, before $\sigma_{N-q}$, or before $\sigma_{\lceil(1-2\varepsilon)N\rceil}$. Taking the time of birth into consideration, the individuals born before $\sigma_N$ will be sampled independently with probability $\gamma^{-1} - N^{-a}$ to form $\underline{Q}_i$, those born before $\sigma_{N-q}$ will be chosen according to (28) to form $Q_i$, and those born before $\sigma_{\lceil(1-2\varepsilon)N\rceil}$ with probability $\gamma^{-1} + N^{-a}$ to form $\overline{Q}_i$.


It is clear that Lemma 3.4 and Corollary 3.5 still hold, and thus we can prove the equivalent to Proposition 3.7. Define

$$T^N(m) := \inf\{i : Q_i \ge m\varepsilon N \text{ or } Q_i = 0\}, \quad m \ge 1.$$

Lemma 3.17. Let $a \in (b, 1/2)$. Let $m \ge 1$, and $0 < \varepsilon < 1/(m\gamma)$. Assume $\underline{Q}_0 = Q_0 = \overline{Q}_0 \le \varepsilon N$. Then for $N$ large enough,

$$\mathbb{P}\big(\underline{Q}_{\min\{i,T^N(m)\}} \le Q_{\min\{i,T^N(m)\}} \le \overline{Q}_{\min\{i,T^N(m)\}}\ \ \forall\, i \le g\big) \ge \big(1 - 2e^{-cN^{1-2a}}\big)^{g} \quad \text{for all } g \in \mathbb{N}_0,$$

for some constant $c$ independent of $N$.

Proof. This follows from a straightforward adaptation of the proof of Proposition 3.7, since the condition $\varepsilon < 1/(m\gamma)$ allows us to prove the analog of Lemma 3.6, observing that the definition of $T^N(m)$ ensures that we stop the procedure if $Q_i$ reaches $m\varepsilon N$ individuals (and not $\varepsilon N$ as in Proposition 3.7).

□


We have the alternative description corresponding to Proposition 3.3: $(\overline{Q}_i)_{i\in\mathbb{N}_0}$ is the Galton-Watson process whose offspring distribution is mixed binomial with parameters $W$ and $\frac{1}{\gamma} + N^{-a}$, where $W$ is geometric with parameter $e^{-r\sigma_{\lceil(1-2\varepsilon)N\rceil}}$. Similarly, $(\underline{Q}_i)_{i\in\mathbb{N}_0}$ is the Galton-Watson process whose offspring distribution is mixed binomial with parameters $W$ and $\frac{1}{\gamma} - N^{-a}$, where $W$ is geometric with parameter $e^{-r\sigma_N}$. From this we obtain the analogue of Lemma 3.10.


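Lemma 3.18. For $(\underline{Q}_i)_{i\in\mathbb{N}_0}$ and $(\overline{Q}_i)_{i\in\mathbb{N}_0}$ defined above there exist $\underline{c}, \overline{c}$ independent of $N$ such that for $N$ large enough,

$$\mathbb{E}_1[\underline{Q}_1] = 1 - \underline{c}\varrho_N + o(\varrho_N) \quad \text{and} \quad \mathbb{E}_1[\overline{Q}_1] = 1 - \overline{c}\varrho_N + o(\varrho_N).$$

Proof. By construction, and from Corollary 3.9,

$$
\begin{gathered}
\mathbb{E}_1[\underline{Q}_1] = (1/\gamma - N^{-a})\,\mathbb{E}[W] = (1/\gamma - N^{-a})\,e^{r\sigma_N} \\
= (1/\gamma - N^{-a})\Big(\gamma - \varrho_N\frac{\gamma\log\gamma}{r}\Big) + o(\varrho_N) \\
= 1 - \frac{\log\gamma}{r}\varrho_N + o(\varrho_N),
\end{gathered}
\tag{76}
$$

where the last equality follows from the fact that our assumptions imply that $N^{-a} = o(\varrho_N)$. This is the first assertion in (76). In the same way we obtain $\mathbb{E}_1[\overline{Q}_1] = 1 - \overline{c}\varrho_N + o(\varrho_N)$, for some positive constant $\overline{c}$ independent of $N$. □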


Lemma 3.19. Let $m \ge 1$ and $0 < \varepsilon < 1/(m\gamma)$. For any $k \ge (1 - \varepsilon)N$,

lim supP k (Abe > Qn 1 ~ & ) ^ 2 / m 

N—> oo 


for any 5 > 0. In particular, P/.(3* : A/ = N) > 1 — 2/m. 

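Proof. Under $\mathbb{P}_k$ we have by assumption that $K_0 = k \ge (1 - \varepsilon)N$, and thus $Q_0 = N - k \le \varepsilon N$. We consider $(\underline{Q}_i)_{i\in\mathbb{N}_0}$, $(\overline{Q}_i)_{i\in\mathbb{N}_0}$ as constructed at the beginning of this section, with $a \in (b, 1/2)$. Let

$$A := A(\gamma, a, \varepsilon, N, m) := \big\{\underline{Q}_{\min\{i,T^N(m)\}} \le Q_{\min\{i,T^N(m)\}} \le \overline{Q}_{\min\{i,T^N(m)\}}\ \ \forall\, i \le \varrho_N^{-1-\delta}\big\}.$$

Then Lemma 3.17 shows

$$\mathbb{P}(A) \to 1 \quad \text{as } N \to \infty.$$

Note that

$$\mathbb{E}_k\big[\overline{Q}_{\lfloor\varrho_N^{-1-\delta}\rfloor}\big] \sim (N - k)(1 - \overline{c}\varrho_N)^{\varrho_N^{-(1+\delta)}} \le (N - k)\,e^{-\overline{c}\varrho_N^{-\delta}} \le \varepsilon N e^{-\overline{c}\varrho_N^{-\delta}} \to 0$$

as $N \to \infty$. Consequently, since on the event $\{T^N(m) > \varrho_N^{-1-\delta}\} \cap A$ we have $\overline{Q}_{\lfloor\varrho_N^{-1-\delta}\rfloor} \ge 1$,

$$
\begin{gathered}
\mathbb{P}_k\big(T^N(m) > \varrho_N^{-1-\delta}\big) \le \mathbb{P}_k\big(T^N(m) > \varrho_N^{-1-\delta},\ A\big) + \mathbb{P}_k(A^c) \le \mathbb{P}_k\big(\overline{Q}_{\lfloor\varrho_N^{-1-\delta}\rfloor} \ge 1\big) + \mathbb{P}_k(A^c) \\
\le \mathbb{E}_k\big[\overline{Q}_{\lfloor\varrho_N^{-1-\delta}\rfloor}\big] + \mathbb{P}_k(A^c) \to 0 \quad \text{as } N \to \infty.
\end{gathered}
$$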

Since 


MriL > Q~n~ 5 ) =Mt£L > g-J-^T^m) > q£~ s ) + P*(t£ > t i£~* ,T% {m) < g^~ s ) 

<P k{T^(m) > g^ s ) +¥ k (Q T N {m) > emN), 

we are left with proving 

lirnsupP 7 c(Q T N( m ) > emN) < 2/m. (77) 

N—>oo 

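Let $\kappa$ be the first time when $(\overline{Q}_i)_{i\ge 0}$ is not less than $\varepsilon mN$ or equal to 0. Note that under $A \cap \{T^N(m) \le \varrho_N^{-1-\delta}\}$, if $Q_{T^N(m)} \ge \varepsilon mN$, then necessarily $\overline{Q}_{T^N(m)} \ge \varepsilon mN$. So in conclusion:

$$\mathbb{P}_k\big(Q_{T^N(m)} \ge \varepsilon mN,\ A,\ T^N(m) \le \varrho_N^{-1-\delta}\big) \le \mathbb{P}_k\big(\overline{Q}_\kappa \ge \varepsilon mN,\ A,\ T^N(m) \le \varrho_N^{-1-\delta}\big). \tag{78}$$

Notice that $(\overline{Q}_i)_{i\ge 0}$ is, as a sub-critical Galton-Watson process, a supermartingale. Then $(\overline{Q}_i \wedge \varepsilon mN)_{i\ge 0}$ is a bounded supermartingale and, for any time strictly before $\kappa$, these two supermartingales are the same. Now we have

$$\varepsilon N \ge \mathbb{E}_k[\overline{Q}_0] = \mathbb{E}_k[\overline{Q}_0 \wedge \varepsilon mN] \ge \mathbb{E}_k[\overline{Q}_\kappa \wedge \varepsilon mN] \ge \mathbb{P}_k(\overline{Q}_\kappa \ge \varepsilon mN)\,\varepsilon mN.$$

So

$$\mathbb{P}_k(\overline{Q}_\kappa \ge \varepsilon mN) \le 1/m.$$

Therefore using (78) we have for $N$ large enough

$$\mathbb{P}_k\big(Q_{T^N(m)} \ge \varepsilon mN\big) \le \mathbb{P}_k\big(\overline{Q}_\kappa \ge \varepsilon mN\big) + \mathbb{P}_k\big(T^N(m) > \varrho_N^{-1-\delta}\big) + \mathbb{P}(A^c) \le 2/m.$$

This implies (77), and moreover $\mathbb{P}_k(\exists i : K_i = N) = \mathbb{P}_k(Q_{T^N(m)} = 0) \ge 1 - 2/m$.

□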
This result will be useful in the following simple form:

Corollary 3.20. For every $0 < \varepsilon < 1/(4\gamma)$ there exists $N_\varepsilon^{(3)} \in \mathbb{N}$ such that for all $N \ge N_\varepsilon^{(3)}$, $\delta > 0$ and $k \ge (1 - \varepsilon)N$

> Q~n~ 5 ) < 1 / 2 . 


Proof. Take $m \ge 4$ in Lemma 3.19.


□ 


3.8. Proof of Theorem 2.9 


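We are now finally able to prove Theorem 2.9. Let $m \ge 4$ and $0 < \varepsilon < 1/(m\gamma) \wedge 1/16$. By Lemma 3.13 we have

$$\pi_N = \mathbb{P}_1(\exists i : K_i = N) \le \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N\big) \le \frac{\gamma\log\gamma}{\gamma-1}\,\frac{\varrho_N}{r}\,(1 + o(1)).$$

Further, observe that for $1 \le k \le k' \le l \le N$, by definition of the model,

$$\mathbb{P}_k(K_1 \ge l) \le \mathbb{P}_{k'}(K_1 \ge l)$$

and therefore by induction $\mathbb{P}_k(K_i \ge l) \le \mathbb{P}_{k'}(K_i \ge l)$, $i \in \mathbb{N}$. Thus

$$\mathbb{P}_k\big((K_i) \text{ reaches } l\big) \le \mathbb{P}_{k'}\big((K_i) \text{ reaches } l\big).$$

Therefore, for every $\varepsilon \in (0, 1/(m\gamma) \wedge 1/16)$, by the strong Markov property and Lemma 3.13

$$
\begin{gathered}
\pi_N \ge \mathbb{P}_{\lceil\varepsilon N\rceil}(\exists i : K_i = N) \cdot \mathbb{P}_1\big((K_i) \text{ reaches } \varepsilon N\big) \\
\ge \mathbb{P}_{\lceil\varepsilon N\rceil}(\exists i : K_i = N) \cdot \frac{\gamma\log\gamma}{\gamma-1}\,\frac{\varrho_N}{r}\,(1 - \varepsilon)(1 + o(1)).
\end{gathered}
$$

From Lemmas 3.19 and 3.15 we obtain $\liminf_{N\to\infty} \mathbb{P}_{\lceil\varepsilon N\rceil}(\exists i : K_i = N) \ge 1 - 2/m$ for any $m > 2$. Thus

$$(1 - \varepsilon)(1 - 2/m) \le \liminf_{N\to\infty} \frac{\gamma-1}{\gamma\log\gamma}\,\frac{r}{\varrho_N}\,\pi_N \le \limsup_{N\to\infty} \frac{\gamma-1}{\gamma\log\gamma}\,\frac{r}{\varrho_N}\,\pi_N \le 1.$$

Sending $m \to \infty$ (and $\varepsilon \to 0$) gives (19).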
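To give a feeling for the size of the constant in the fixation probability, the following minimal sketch evaluates $C(\gamma) = \gamma\log\gamma/(\gamma-1)$ and the leading-order approximation $\pi_N \approx C(\gamma)\varrho_N/r$. The function names are assumptions made here for illustration; the dilution factor $\gamma = 100$ mentioned in the usage note corresponds to the daily sampling of about $5\cdot 10^6$ out of $5\cdot 10^8$ cells described in the Introduction, while any value of $\varrho_N$ plugged in is hypothetical.

```python
import math


def c_gamma(gamma: float) -> float:
    """Constant C(gamma) = gamma * log(gamma) / (gamma - 1)."""
    return gamma * math.log(gamma) / (gamma - 1.0)


def fixation_probability_approx(rho_n: float, r: float, gamma: float) -> float:
    """Leading-order fixation probability C(gamma) * rho_N / r (asymptotic, N large)."""
    return c_gamma(gamma) * rho_n / r
```

For $\gamma = 100$ the constant is roughly 4.65, and it tends to 1 as $\gamma \downarrow 1$. The approximation is only meaningful in the asymptotic regime of the theorem, that is, for small $\varrho_N$.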

Now we will prove that Pi(r w > gf, 1 ^ 26 ) < ( 7 / 8 ) e «\ Let fV e = sup{ N^\ ivj 2 \ } where 
$N_\varepsilon^{(1)}, N_\varepsilon^{(2)}, N_\varepsilon^{(3)}$ can be found respectively in Corollaries 3.14, 3.16 and 3.20. Using the three corollaries and the strong Markov property of the process $(K_i)_{i\in\mathbb{N}_0}$ we know that for all $N \ge N_\varepsilon$, and for any $k \in \{1, 2, \ldots, N\}$

$$\mathbb{P}_k\big(\tau^N \le 3\varrho_N^{-1-\delta}\big) \ge (1/2)^3. \tag{79}$$

Using the Markov property at time |~3£))y 1 "' 5 ], we see that for any n £ N 

N—l 

Pr (r N > 3 ngf, 1 - 5 ) < F^t" > (S^ 1 " 5 }) E P^r* > 3(n - l)^ 1 " 4 ^^-!-^ = k) 


k=1 


N—l 


< (1 - (1/2) 3 ) E > 3 (™ - l )Q~N~ S )^{K W -s A = k). 


k=1 


Thus, proceeding iteratively, and using the fact that (79) is uniform in $k \in \{1, \ldots, N-1\}$, we obtain

$$\mathbb{P}_1\big(\tau^N > 3n\varrho_N^{-1-\delta}\big) \le \big(1 - (1/2)^3\big)^n.$$


In particular, choosing n = [f 7 JV ' 5 ] we obtain for <5 > 0 


i(r iv >^ 1 - 35 )Pi(r->3^--)<(7/8) 


— 1—2<5\ 


□ 
3.9. Proof of Proposition 2.12

Due to Theorem 2.9 and due to the assumption that the mutations arrive independently of each other at geometric times with parameter $\mu_N$, we have that for any $\delta' > 0$


P(mjv < t ) <1 — P(toat > Bn 


- 1 - 5 '! t n< b - 


■)P {t N <B~ n 1 - 5 ' 


<1 - (1- p N ) [e ” 5 j (1 - (7/8) /3j ). 


Now the Bernoulli inequality yields 


P (m N <t n )< 1 - (1 - - (7/8)L^' /3 J) 

= Bn Leiv 1-5 J + ( 7 / 8 ) L®* J - PnIbn 1-5 J ( 7 / 8 ) L®™ J. 

From this we obtain 

P(toat < t n ) < pnBn 1 S 

for any $\delta > \delta'$, provided $N$ is large enough. This proves the first claim. Now, let $E_j$ be the event that there is no clonal interference until the day that the $j$-th successful mutation starts. Observe that $\mathbb{P}(E_1)$ is given by the probability that any unsuccessful mutation started before the first successful one has disappeared before the next mutation (successful or unsuccessful) starts. By the first part of this proposition, for any given mutation this is the case with probability $\mathbb{P}(m_N > \tau^N) \ge 1 - \mu_N\varrho_N^{-1-\delta}$, for $\delta > 0$. Denote by $L$ the number of mutations until the first successful one. Since the mutations arrive independently of each other, we see by induction that for $l \in \mathbb{N}_0$

P ( no clonal interference in the first l mutations | L = l + 1) > (1 — PnB~n 1S ) 1 - 


By Theorem 2.9, $L$ is (asymptotically) geometric with success parameter $C(\gamma)\varrho_N/r_0$. Thus summing over all possible values of $L$ we obtain by Theorem 2.9 and the first part of this proof, for $\delta > 0$,


P(£i) > ]Tp(L = / + !)(! -PNB~N~ S ) 1 


1=0 


> 


CirfBN /, -i-$y 

/ Z 1 ---) —-- 1 1 - PnBn > 


1=0 

C(l)BN 


ro 


r o 


— S(1 _ si)® _ +3 m ly 

ro ^ r ° r ° 


C(^)bn 


1 


-8 


ro C^bnVq 1 + PnBn S ~ C {l) r o ± PnBn 
o(pnBn 2 ~ 5 ) 


> 1 - PnBn 2 5 

for N large enough and 5" > 5. Fix n £ N. Similar to the previous calculation, for j < ngf, 1 , we have 
¥(Ej + i\Ej) > 1 — pnBn 2 S /3 + o(pnB~n 2 S )• Proceeding iteratively one thus observes that for any 
fixed n G N 


P(*W„|) > (1 -PnBn 2 - 6 "+o(p N BN 2 - S )) [ne " i 


\ i —3—3<5 /j 

^ 1 Ti^LjSf Qjy 


(i 


IN 

o(l)). 


(80) 


By Assumption A iii) this tends to 1 for 5" > 0 small enough. Let I n be the day at which the 
[g/ViJ-th successful mutation starts. We can write 


o =i 
if /A denotes the time between the fixation of the j — 1 th and the initiation of the jth successful
mutation (and 1^ = I\). Let ZA denote the number of unsuccessful mutations that happen du ring 


time /bb. The success probability of a mutation that happens during is according to Theorem 1^9 
given by C(y) ro +(j-i)eN ■ Therefore, conditional on Ej, lA is geometrically distributed with success 

parameter C( 7 ) ra +(j-i) eN ■ Moreover, conditional on Ej, the time between two of the lA unsuccessful 
mutations is stochastically larger than a geometric random variable with parameter /rjv, since this is 
the rate at which mutations arrive, and the geometric distribution is memoryless. Thus we see that 
the time /A is stochastically larger than a geometric random variable with parameter 
and a fortiori stochastically larger than , if (G^)j^ 0 is a sequence of independent geometric 
random variables with parameter C^hnQn/tq. Thus conditionally on Ey ne ~i^, stochastically I n > 

£jL? nJ Gf • Let n = \2Tr 0 /C(-,)l Then 

lim P(no clonal interference until gf^pCffT) > P(E,-i n ,, I n > (gJ^nffT)) 

N—too L un -I 

= > r^A T Pn ^1I-®Le^nj) 

LeivA 

> P(^-„j)(i-2P( £ Gf<L^Wrj)) 

i =1 

By Cramér's large deviation principle the second factor tends to 1. Thus the statement follows from (80). □


3.10. Proof of Theorem 2.13

Denote by $D_i$ the event that there is no clonal interference up to day $i$, that is, any mutation that starts until or including day $i$ happens in a homogeneous population. Define

$$\widetilde{H}_i := H_i 1_{D_i} - \infty\, 1_{D_i^c}.$$

Then we have for any $T > 0$ that the two processes $(H_i)_{1\le i\le \varrho_N^{-2}\mu_N^{-1}T}$ and $(\widetilde{H}_i)_{1\le i\le \varrho_N^{-2}\mu_N^{-1}T}$ coincide on the event $D_{\lfloor\varrho_N^{-2}\mu_N^{-1}T\rfloor}$, whose complement has probability converging to 0 as $N \to \infty$, by Proposition 2.12. Thus it is sufficient to show that $(\widetilde{H}_{\lfloor t\varrho_N^{-1}\mu_N^{-1}\rfloor})_{0\le t\le T}$ converges in distribution to $(M(C(\gamma)t/r_0))_{0\le t\le T}$ with respect
to the Skorokhod topology, cf. Theorem 3.3.1 in m- This will be achieved by a standard generator 
calculation. 

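The process $(\widetilde{H}_i)_{i\in\mathbb{N}_0}$ is a Markov chain on $\mathbb{N}_0 \cup \{-\infty\}$ with the following transition probabilities: If $n \ge 0$, then

$$
\begin{gathered}
\mathbb{P}(\widetilde{H}_{i+1} = n + 1 \mid \widetilde{H}_i = n) = \frac{C(\gamma)\mu_N\varrho_N}{r_0 + n\varrho_N}\,\mathbb{P}(D_{i+1} \mid D_i), \\
\mathbb{P}(\widetilde{H}_{i+1} = n \mid \widetilde{H}_i = n) = \Big(1 - \frac{C(\gamma)\mu_N\varrho_N}{r_0 + n\varrho_N}\Big)\mathbb{P}(D_{i+1} \mid D_i), \\
\mathbb{P}(\widetilde{H}_{i+1} = -\infty \mid \widetilde{H}_i = n) = \mathbb{P}(D_{i+1}^c \mid D_i),
\end{gathered}
$$

and

$$\mathbb{P}(\widetilde{H}_{i+1} = -\infty \mid \widetilde{H}_i = -\infty) = 1.$$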
Observe first that for any $\delta > 0$ we have

$$\mathbb{P}(D_{i+1}^c \mid D_i) \le \mu_N^2\varrho_N^{-1-\delta}. \tag{81}$$

This follows since conditional on the event $D_i$, the event $D_{i+1}^c$ can only happen if at day $i+1$ a new mutation happens, and interferes with the previous mutation. The probability that a new mutation happens is given by $\mu_N$, and the probability of interference of a pair of mutations is $\mathbb{P}(m_N < \tau^N)$. Thus (81) follows from Proposition 2.12.


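For bounded functions $g$ on $\mathbb{N}_0 \cup \{-\infty\}$, the discrete generator of $(\widetilde{H}_i)_{i\in\mathbb{N}_0}$ on the time scale $i = \varrho_N^{-1}\mu_N^{-1}t$ is given by (cf. Theorem 1.6.5 of [10])

$$
\begin{gathered}
B^N g(n) := \frac{1}{\varrho_N\mu_N}\,\mathbb{E}\big[g(\widetilde{H}_{i+1}) - g(n) \mid \widetilde{H}_i = n\big] \\
= \frac{1}{\varrho_N\mu_N}\Big(\frac{C(\gamma)\mu_N\varrho_N}{r_0 + n\varrho_N}\,\mathbb{P}(D_{i+1} \mid D_i)\big(g(n+1) - g(n)\big) + \mathbb{P}(D_{i+1}^c \mid D_i)\big(g(-\infty) - g(n)\big)\Big) \\
= \frac{C(\gamma)}{r_0 + n\varrho_N}\,\mathbb{P}(D_{i+1} \mid D_i)\big(g(n+1) - g(n)\big) + \frac{\mathbb{P}(D_{i+1}^c \mid D_i)}{\varrho_N\mu_N}\big(g(-\infty) - g(n)\big).
\end{gathered}
$$

Due to (81) and Assumption A iii), the r.h.s. converges as $N \to \infty$ to

$$\frac{C(\gamma)}{r_0}\big(g(n+1) - g(n)\big),$$

which is the generator of the Poisson process $(M(C(\gamma)t/r_0))_{t\ge 0}$. By Theorem 4.2.6 of [10] this implies convergence of the corresponding processes. □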

3.11. Convergence of the fitness process 

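Proof of Theorem 2.14. We proceed analogously to the proof of Theorem 2.13. Define

$$\widetilde{\Phi}_i := 1 + \frac{\varrho_N}{r_0}\widetilde{H}_i,$$

and recall $\Phi_i = 1 + \frac{\varrho_N}{r_0}H_i$. As above, observe that the two processes $(\Phi_i)_{1\le i\le \varrho_N^{-2}\mu_N^{-1}T}$ and $(\widetilde{\Phi}_i)_{1\le i\le \varrho_N^{-2}\mu_N^{-1}T}$ coincide on the event $D_{\lfloor\varrho_N^{-2}\mu_N^{-1}T\rfloor}$, whose complement has probability converging to 0 as $N \to \infty$, and that $(\widetilde{\Phi}_i)_{i\in\mathbb{N}_0}$ is a Markov chain with transition probabilities

$$
\begin{gathered}
\mathbb{P}\Big(\widetilde{\Phi}_{i+1} = x + \frac{\varrho_N}{r_0} \,\Big|\, \widetilde{\Phi}_i = x\Big) = \frac{C(\gamma)\mu_N\varrho_N}{x r_0}\,\mathbb{P}(D_{i+1} \mid D_i), \\
\mathbb{P}(\widetilde{\Phi}_{i+1} = x \mid \widetilde{\Phi}_i = x) = \Big(1 - \frac{C(\gamma)\mu_N\varrho_N}{x r_0}\Big)\mathbb{P}(D_{i+1} \mid D_i), \\
\mathbb{P}(\widetilde{\Phi}_{i+1} = -\infty \mid \widetilde{\Phi}_i = x) = \mathbb{P}(D_{i+1}^c \mid D_i),
\end{gathered}
$$

for $x > 0$ and

$$\mathbb{P}(\widetilde{\Phi}_{i+1} = -\infty \mid \widetilde{\Phi}_i = -\infty) = 1.$$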

Thus the discrete generator of $(\widetilde{\Phi}_i)_{i\in\mathbb{N}_0}$ on the time scale $i = \varrho_N^{-2}\mu_N^{-1}t$ is given by

$$
\begin{gathered}
A^N g(x) := \frac{1}{\varrho_N^2\mu_N}\,\mathbb{E}\big[g(\widetilde{\Phi}_{i+1}) - g(x) \mid \widetilde{\Phi}_i = x\big] \\
= \frac{1}{\varrho_N^2\mu_N}\Big(\frac{C(\gamma)\mu_N\varrho_N}{x r_0}\,\mathbb{P}(D_{i+1} \mid D_i)\Big(g\big(x + \tfrac{\varrho_N}{r_0}\big) - g(x)\Big) + \mathbb{P}(D_{i+1}^c \mid D_i)\big(g(-\infty) - g(x)\big)\Big) \\
= \frac{C(\gamma)}{\varrho_N r_0 x}\,\mathbb{P}(D_{i+1} \mid D_i)\Big(g\big(x + \tfrac{\varrho_N}{r_0}\big) - g(x)\Big) + \frac{\mathbb{P}(D_{i+1}^c \mid D_i)}{\varrho_N^2\mu_N}\big(g(-\infty) - g(x)\big).
\end{gathered}
$$

Due to (81) and Assumption A iii), the r.h.s. converges for a continuously differentiable function $g : \mathbb{R} \to \mathbb{R}$ that vanishes at $\infty$, as $N \to \infty$, to

$$Ag(x) := \frac{C(\gamma)}{r_0^2 x}\,g'(x)$$

(as can be seen from Taylor's expansion, compare the proof of Lemma 3.15). This, in turn, is the generator of the solution to the (deterministic) differential equation

$$h'(t) = \frac{C(\gamma)}{r_0^2}\,\frac{1}{h(t)}, \quad t \ge 0,$$

whose solution (for the initial value $h(0) = 1$) is $f$. So we can apply Theorem 4.2.6 in [10] to conclude that $\widetilde{\Phi}$ and then $\Phi$ converges in distribution to $(f(t))_{t\ge 0}$ in the Skorokhod topology. Convergence of $F$ follows from the relation (23). Since $f$ is continuous, this amounts to locally uniform convergence in distribution. □
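The generator calculation above can be illustrated by a small simulation of the idealised jump chain without clonal interference. This is a sketch under stated assumptions: successful mutations are taken to arrive after geometric waiting times with per-day success probability $C(\gamma)\mu_N\varrho_N/(x r_0)$ in state $x$, each jump has size $\varrho_N/r_0$, and the limit is taken to be the solution $f(t) = \sqrt{1 + 2C(\gamma)t/r_0^2}$ of $h' = C(\gamma)/(r_0^2 h)$ with $h(0) = 1$. All function names and parameter values are hypothetical choices for illustration.

```python
import math
import random


def c_gamma(gamma: float) -> float:
    return gamma * math.log(gamma) / (gamma - 1.0)


def simulate_fitness(n_jumps, rho, mu, r0, gamma, seed=0):
    """Return (rescaled time, fitness) after n_jumps successful mutations."""
    rng = random.Random(seed)
    c = c_gamma(gamma)
    x, days = 1.0, 0
    for _ in range(n_jumps):
        p = c * mu * rho / (x * r0)      # per-day success probability in state x
        u = 1.0 - rng.random()           # uniform on (0, 1], avoids log(0)
        # inverse-transform sample of a geometric waiting time on {1, 2, ...}
        days += int(math.log(u) / math.log1p(-p)) + 1
        x += rho / r0                    # fitness increment of one sweep
    return days * rho**2 * mu, x         # time measured in units of rho^-2 mu^-1 days


def f_limit(t: float, r0: float, gamma: float) -> float:
    """Solution of h' = C(gamma) / (r0^2 h), h(0) = 1."""
    return math.sqrt(1.0 + 2.0 * c_gamma(gamma) * t / r0**2)
```

For example, `t, x = simulate_fitness(2000, 0.01, 1e-6, 1.0, 100.0)` followed by `f_limit(t, 1.0, 100.0)` should give a value close to `x`. The sweep durations are ignored in this sketch, in line with the time-scale separation used in the proof.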


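Proof of Corollary 2.15. The proof is as for Theorem 2.14, with the only difference that now we replace $\widetilde{\Phi}$ by $\widetilde{\Phi}^\psi$, with transition probabilities for $x \ge 1$

$$
\begin{gathered}
\mathbb{P}\Big(\widetilde{\Phi}^\psi_{i+1} = x + \frac{\psi(x)\varrho_N}{r_0} \,\Big|\, \widetilde{\Phi}^\psi_i = x\Big) = \frac{C(\gamma)\mu_N\varrho_N\psi(x)}{x r_0}\,\mathbb{P}(D_{i+1} \mid D_i), \\
\mathbb{P}(\widetilde{\Phi}^\psi_{i+1} = x \mid \widetilde{\Phi}^\psi_i = x) = \Big(1 - \frac{C(\gamma)\mu_N\varrho_N\psi(x)}{x r_0}\Big)\mathbb{P}(D_{i+1} \mid D_i), \\
\mathbb{P}(\widetilde{\Phi}^\psi_{i+1} = -\infty \mid \widetilde{\Phi}^\psi_i = x) = \mathbb{P}(D_{i+1}^c \mid D_i),
\end{gathered}
$$

for $x > 0$ and

$$\mathbb{P}(\widetilde{\Phi}^\psi_{i+1} = -\infty \mid \widetilde{\Phi}^\psi_i = -\infty) = 1,$$

which leads to a slightly different discrete generator

$$A^N_\psi g(x) = \frac{C(\gamma)\psi(x)}{\varrho_N r_0 x}\,\mathbb{P}(D_{i+1} \mid D_i)\Big(g\big(x + \tfrac{\psi(x)\varrho_N}{r_0}\big) - g(x)\Big) + \frac{\mathbb{P}(D_{i+1}^c \mid D_i)}{\varrho_N^2\mu_N}\big(g(-\infty) - g(x)\big).$$

Thus we get

$$\lim_{N\to\infty} A^N_\psi g(x) = \frac{C(\gamma)\psi(x)^2}{r_0^2 x}\,g'(x)$$

and we conclude as above. In particular, solving

$$h'(t) = \frac{C(\gamma)}{r_0^2}\,\frac{1}{h(t)^{2q+1}}$$

yields (25).

□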
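As a plausibility check of the last step, one can verify numerically that a power law solves the differential equation $h'(t) = \frac{C(\gamma)}{r_0^2}\,h(t)^{-(2q+1)}$ with $h(0) = 1$. The candidate $h(t) = \big(1 + 2(q+1)C(\gamma)t/r_0^2\big)^{1/(2(q+1))}$ used below is obtained here by separation of variables; identifying it with (25) is an assumption, since (25) is not reproduced in this section. Function names and parameter values are illustrative.

```python
def h_power_law(t: float, q: float, c: float, r0: float) -> float:
    """Candidate solution of h' = (c / r0^2) * h^-(2q+1) with h(0) = 1."""
    return (1.0 + 2.0 * (q + 1.0) * c * t / r0**2) ** (1.0 / (2.0 * (q + 1.0)))


def ode_residual(t: float, q: float, c: float, r0: float, dt: float = 1e-6) -> float:
    """Central-difference derivative of the candidate minus the right-hand side."""
    lhs = (h_power_law(t + dt, q, c, r0) - h_power_law(t - dt, q, c, r0)) / (2.0 * dt)
    rhs = (c / r0**2) / h_power_law(t, q, c, r0) ** (2.0 * q + 1.0)
    return lhs - rhs
```

For $q = 0$ the candidate reduces to $\sqrt{1 + 2C(\gamma)t/r_0^2}$, the limit obtained in the proof of Theorem 2.14.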


Appendix A. Basics on Yule processes and proof of Theorem 2.5 


Definition Appendix A.1 (Yule process). A Yule process with rate $r$ is a continuous-time Markov process taking values in $\mathbb{N}$ such that the transition rates are given by:

$n \to n + 1$ at rate $rn$,
$n \to$ others at rate 0.

Remark Appendix A.2. Consider a population model starting with $n_0$ individuals, where each individual reproduces independently at rate $r$ by splitting into two individuals. Then counting the total number of individuals, one gets a Yule process. This is the population model which we consider in the Lenski experiment during one day, with starting population size $n_0 = N$.

The next lemma is well-known. For part a) see e.g. [3], p. 109; part b) is due to the independence of the branching.

Lemma Appendix A.3. Let $Z_r$ be a Yule process with rate $r$.

a) If $Z_r(0) = 1$, then, for $t \ge 0$, $Z_r(t)$ follows a geometric distribution with parameter $e^{-rt}$.

b) If $Z_r(0) = n_0 \in \mathbb{N}$, then $Z_r(t)$ follows a negative binomial distribution with parameters $n_0$ and $e^{-rt}$. In particular,

$$\mathbb{E}[Z_r(t)] = n_0 e^{rt}, \quad \text{and} \quad \mathrm{var}(Z_r(t)) = e^{rt}(e^{rt} - 1)n_0.$$
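The moment formulas can be checked directly from the geometric law in part a). The sketch below sums the probability mass function $p(1-p)^{k-1}$, $k \ge 1$, with $p = e^{-rt}$, and compares with $e^{rt}$ and $e^{rt}(e^{rt} - 1)$; for $n_0$ independent ancestors both quantities are multiplied by $n_0$. The function name and the truncation level `k_max` are choices made here for illustration.

```python
import math


def geometric_moments(r: float, t: float, k_max: int = 2000):
    """Mean and variance of a geometric law on {1, 2, ...} with parameter exp(-r t),
    computed by truncated summation of the probability mass function."""
    p = math.exp(-r * t)
    mean = 0.0
    second_moment = 0.0
    tail = 1.0  # (1 - p)^(k - 1)
    for k in range(1, k_max + 1):
        pk = p * tail
        mean += k * pk
        second_moment += k * k * pk
        tail *= 1.0 - p
    return mean, second_moment - mean**2
```

With $r = t = 1$ the result should be close to $e \approx 2.718$ for the mean and $e(e-1) \approx 4.671$ for the variance; the truncation error is negligible as long as $(1-p)^{k_{\max}}$ is tiny.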
The next lemma shows that Qv is asymptotically equal to a. 

Lemma Appendix A.4. Let Qv o,nd a = <7q be as defined in (Jsj) and (13). Then 


(d) 

Sn -t cr- 


Proof. During one day in the Lenski experiment, consider the population consisting of N subpopu¬ 
lations each of whose sizes follows an independent Yule process with parameter r. Let Z r N (t) denote 
the size of total population at time t. Then Z^(t) is the sum of N i.i.d geometric variables with 
parameter e~ rt . Let e > 0. Then due to the law of large numbers 

p , Z r N (a-e) N—^oo p , Z r N {a + e) . ^ 

v jN ’ v yTV ’ 


Therefore P(cr — e < Cat < a + e) 


N—too 


1. Since £ can be arbitrarily small, the lemma follows. 


□ 


Proof of Theorem 


2.5 


This is a direct application of Theorem 2.1 in 


Fix a generation in the Cannings model and let $c_N$ be the probability for a pair of individuals to be coalesced in the previous generation and $d_N$ the probability for a triple of individuals to be coalesced in the previous generation. Then it suffices to prove that

$$c_N \xrightarrow{N\to\infty} 0, \qquad d_N/c_N \xrightarrow{N\to\infty} 0. \tag{A.1}$$


Notice that $c_N, d_N$ do not depend on the generation since the reproduction, sampling and labeling in each day do not depend on the past and on the future. Therefore we can consider a typical day (the population at the beginning of a day constitutes a generation) and take the notations at the beginning of Section 2.1.1. Let $Y_t^i$ be the size of the family of individual $i$ at time $t$. Then

$$Z_t = Y_t^1 + Y_t^2 + \cdots + Y_t^N,$$


with (Y^)i<j<jv identically and independently distributed as a geometric distribution with parameter 
e~ rt . The day ends at time a = and notice that the population for the next day will be chosen 
uniformly, hence one can express cjv, d/v as follows: 


cn = E[ 


Etr ft) - 

(T) 


2 ( 1 -i) Y N , ( Y °) 

1 -,d N = E[^ i=lUj 


N 


(Y) 


] = 0(A" 2 ), 


which gives (A.l), and thus completes the proof. 


□ 


Appendix B. Properties of near-critical Galton-Watson processes 

The following lemma (Theorem 3 of [2], and see also Theorem 5.5 in m under weaker conditions) 
provides the survival probability for certain near-critical Galton-Watson trees. 

Lemma Appendix B.1. Consider a sequence of supercritical Galton-Watson processes $(G_i^N)_{i\ge 0}$, $N = 1, 2, \ldots$, with offspring mean $1 + \beta_N$ (with $\beta_N \to 0$) and offspring variance $\sigma^2 + v_N$ (with $v_N \to 0$) and uniformly bounded third moment, starting from one ancestor in generation 0. Then the survival probability $\phi_N$ obeys $\phi_N \sim \frac{2\beta_N}{\sigma^2}$.
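The asymptotics $\phi_N \sim 2\beta_N/\sigma^2$ can be illustrated on a concrete family. The following sketch assumes Poisson offspring with mean $1 + \beta$, a choice made here purely for illustration (then $\sigma^2 = 1$, the variance perturbation equals $\beta$, and third moments are bounded); the extinction probability is obtained as the smallest fixed point of the probability generating function $s \mapsto e^{(1+\beta)(s-1)}$ by iterating from 0.

```python
import math


def survival_probability_poisson(beta: float, tol: float = 1e-15,
                                 max_iter: int = 10**7) -> float:
    """Survival probability of a Galton-Watson process with Poisson(1 + beta) offspring.

    Iterates q -> exp((1 + beta) * (q - 1)) from q = 0, which increases
    monotonically to the extinction probability.
    """
    m = 1.0 + beta
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(m * (q - 1.0))
        if abs(q_next - q) < tol:
            break
        q = q_next
    return 1.0 - q
```

The ratio `survival_probability_poisson(beta) / (2 * beta)` should approach 1 as `beta` decreases, in line with the lemma. The iteration needs on the order of $\beta^{-1}$ steps per digit, so very small values of `beta` are slow.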
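Lemma Appendix B.2. Let $(G_i^N)_{i\in\mathbb{N}_0}$, $N = 1, 2, \ldots$ be as in Lemma Appendix B.1. Assume that $\beta_N N \to \infty$ as $N \to \infty$. Then, for every $\varepsilon > 0$, $\mathbb{P}(\exists i : G_i^N \ge \varepsilon N) \sim \mathbb{P}(\lim_{i\to\infty} G_i^N = \infty)$.

Proof. Again let $\phi_N$ be the survival probability of $G^N$ started in one individual. Then

$$\mathbb{P}\Big(\lim_{i\to\infty} G_i^N = \infty \,\Big|\, \exists i : G_i^N \ge \varepsilon N\Big) \ge 1 - (1 - \phi_N)^{\varepsilon N} \sim 1 - \Big(1 - \frac{2\beta_N}{\sigma^2}\Big)^{\varepsilon N} \to 1, \quad N \to \infty.$$

□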
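The role of the condition $\beta_N N \to \infty$ can be seen from the lower bound used in the proof. The sketch below evaluates $1 - (1 - 2\beta_N/\sigma^2)^{\lfloor\varepsilon N\rfloor}$ for $\beta_N = N^{-b}$; the function name and the particular values of $b$, $\sigma^2$ and $\varepsilon$ are hypothetical choices for illustration.

```python
def prob_some_line_survives(n: int, b: float, sigma2: float, eps: float) -> float:
    """Lower bound 1 - (1 - phi)^floor(eps * n) with phi = 2 * n**(-b) / sigma2."""
    phi = 2.0 * n ** (-b) / sigma2
    return 1.0 - (1.0 - phi) ** int(eps * n)
```

With $b < 1$ the exponent $\varepsilon N$ grows faster than $\phi_N^{-1}$, so the bound tends to 1: once $\varepsilon N$ individuals are present, at least one of them founds a surviving line with high probability.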


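Lemma Appendix B.3. Let $(G_i^N)_{i\in\mathbb{N}_0}$, $N = 1, 2, \ldots$ be as in Lemma Appendix B.1. Assume that $\beta_N \sim cN^{-b}$, $N = 1, 2, \ldots$, for some $c > 0$ and $b \in (0,1)$. For fixed $\varepsilon \in (0,1)$, let $\omega_N := \inf\{i \ge 0 : G_i^N \ge \varepsilon N\}$. Then we have for any $\delta > 0$

$$\lim_{N\to\infty} \mathbb{P}_1\big(\omega_N > \beta_N^{-1-\delta} \mid \omega_N < \infty\big) = 0.$$

Further, let $\nu_N := \inf\{i \ge 0 : G_i^N = 0\}$. Then for any $\delta > 0$, for $N$ large enough,

$$\mathbb{P}_1\big(\nu_N > \beta_N^{-1-\delta} \mid \nu_N < \infty\big) \le e^{-N^{b\delta}}. \tag{B.1}$$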

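Proof. First we consider the difference between conditioning $G^N$ on survival (forever) and on reaching $\varepsilon N$, respectively. Since we know (from Lemma B.1) that

$$\mathbb{P}_1(G^N \text{ survives}) \sim \frac{2\beta_N}{\sigma^2} \sim c'N^{-b}, \tag{B.2}$$

we can infer, using the strong Markov property, that

$$
\begin{gathered}
\mathbb{P}_1\big(G^N \text{ reaches } \varepsilon N \text{ and } G^N \text{ does not survive}\big) \le \mathbb{P}_{\lfloor\varepsilon N\rfloor}\big(G^N \text{ does not survive}\big) \\
= (1 - \phi_N)^{\lfloor\varepsilon N\rfloor} \le (1 - c'N^{-b})^{\lfloor\varepsilon N\rfloor} \le \exp(-c(\varepsilon)N^{1-b}).
\end{gathered}
\tag{B.3}
$$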

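Thus we can estimate

$$
\begin{gathered}
\mathbb{P}_1\big(\omega_N > \beta_N^{-1-\delta} \mid G^N \text{ reaches } \varepsilon N\big) = \frac{1}{\mathbb{P}_1(G^N \text{ reaches } \varepsilon N)}\,\mathbb{P}_1\big(\omega_N > \beta_N^{-1-\delta},\ G^N \text{ reaches } \varepsilon N\big) \\
\le \frac{1}{\mathbb{P}_1(G^N \text{ reaches } \varepsilon N)}\,\mathbb{P}_1\big(G^N \text{ reaches } \varepsilon N \text{ and does not survive}\big) \\
+ \frac{1}{\mathbb{P}_1(G^N \text{ survives})}\,\mathbb{P}_1\big(\omega_N > \beta_N^{-1-\delta},\ G^N \text{ survives}\big).
\end{gathered}
$$

The first summand on the r.h.s. tends to 0 as $N \to \infty$ because of (B.2) and (B.3). Thus, for proving the lemma it suffices to show that

$$\lim_{N\to\infty} \mathbb{P}_1\big(\omega_N > \beta_N^{-1-\delta} \mid G^N \text{ survives}\big) = 0. \tag{B.4}$$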


Let $\phi_N$ be the survival probability of $G^N$, and denote by $H_i^N$, $i = 0, 1, \ldots$, the generation sizes of those individuals that have an infinite line of descent, conditioned on survival of $G^N$. Then we have (cf. Proposition 5.28 in [20])

$$f^*(s) := \sum_{k\ge 0} s^k\,\mathbb{P}_1(H_1^N = k) = \mathbb{E}_1\big[s^{H_1^N}\big] = \frac{\mathbb{E}\big[(1 - \phi_N + \phi_N s)^{G_1^N}\big] - (1 - \phi_N)}{\phi_N}, \quad s \ge 0.$$

Obviously, $\mathbb{P}_1(H_1^N = 0) = f^*(0) = 0$ and $\mathbb{P}_1(H_1^N = 1) = (f^*)'(0) = \mathbb{E}\big[G_1^N(1 - \phi_N)^{G_1^N - 1}\big]$, which, using Taylor expansion, is transformed to

$$
\begin{gathered}
\mathbb{E}\Big[G_1^N\Big(1 - (G_1^N - 1)\phi_N + \frac{(G_1^N - 1)(G_1^N - 2)}{2}(1 - t\phi_N)^{G_1^N - 3}\phi_N^2\Big)\Big] \\
= \mathbb{E}_1\big[G_1^N\big(1 - (G_1^N - 1)\phi_N\big)\big] + O(\beta_N^2) = 1 - \beta_N + o(\beta_N),
\end{gathered}
\tag{B.5}
$$

where $t = t(G_1^N) \in (0,1)$. The first equality is due to the assumption in Lemma Appendix B.1 that the third order moment of $G_1^N$ is uniformly bounded. We can thus infer that, for any fixed $\eta \in (0,1)$,

$$\mathbb{P}_1(H_1^N \ge 2) \ge \eta\beta_N, \quad \text{when } N \text{ is large enough.}$$

We can now give a lower bound for $G_i^N$, conditioned on survival of $G^N$, in two steps: first by $H_i^N$, and then by a (discrete time) Galton-Watson process with offspring distribution $(1 - \eta\beta_N)\delta_1 + \eta\beta_N\delta_2$. Call this process $B^N$. With $\lfloor\eta\beta_N\rfloor^{-1}$ generations as a new time unit, the sequence of processes $B^N$ converges, as $N \to \infty$, to a standard Yule process. This means that, for every fixed $t > 0$, at a time of $t\lfloor\eta\beta_N\rfloor^{-1}$ generations, $B^N$ has an approximate geometric distribution with parameter $e^{-t}$. Thus we conclude that after $\lfloor\beta_N\rfloor^{-1-\delta}$ generations, $B^N$ (and a fortiori also $G^N$ when conditioned to survival) is larger than $\varepsilon N$ with probability tending to 1 as $N \to \infty$. This shows (B.4), and concludes the proof of the first statement.

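For the last statement, observe that by Theorem 5.28 of [20] the distribution of $(G_i^N)$ conditioned on extinction is equal to the distribution of a Galton-Watson process with probability generating function

$$\tilde{f}(s) := (1 - \phi_N)^{-1}\sum_{k\ge 0}\big((1 - \phi_N)s\big)^k\,\mathbb{P}_1(G_1^N = k).$$

Thus we have

$$\mathbb{E}_1\big[G_1^N \mid G^N \text{ dies out}\big] = \tilde{f}'(1) = \mathbb{E}\big[G_1^N(1 - \phi_N)^{G_1^N - 1}\big] = 1 - \beta_N + o(\beta_N),$$

where the last equality follows from equation (B.5). Then, by Proposition 5.2 in [20] we observe that

$$\mathbb{E}_1\big[G^N_{\lfloor\beta_N^{-1-\delta}\rfloor} \mid G^N \text{ dies out}\big] = \big(1 - \beta_N + o(\beta_N)\big)^{\lfloor\beta_N^{-1-\delta}\rfloor} \le e^{-N^{b\delta}}, \tag{B.6}$$

so we conclude

$$\mathbb{P}_1\big(\nu_N > \beta_N^{-1-\delta} \mid \nu_N < \infty\big) = \mathbb{P}_1\big(G^N_{\lfloor\beta_N^{-1-\delta}\rfloor} > 0 \mid G^N \text{ dies out}\big) \le \mathbb{E}_1\big[G^N_{\lfloor\beta_N^{-1-\delta}\rfloor} \mid G^N \text{ dies out}\big] \le e^{-N^{b\delta}}.$$

□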